Preface


You are holding the volume of extended abstracts
of the Symposium on Applied Mathematical
Programming and Modeling (APMOD93).
The institutional organizer of the event is the
Computer and Automation Institute of the
Hungarian Academy of Sciences; the venue is Budapest,
Hungary, and the dates are January 6-8, 1993.

The purpose of APMOD93, as the successor of
APMOD91, held at Brunel University, London, UK,
in 1991, was to provide a continuing forum for new
achievements in computational mathematical
programming and modeling and their applications
in solving large and difficult real-life
problems.

The organization of APMOD93 was supported by many
enthusiastic individuals and bodies. The bulk of
the work in preparing the scientific program was
done by the members of the International Program
Committee.

This volume contains the extended abstracts of
the papers accepted for presentation and received
by December 1, 1992. The papers appear in
alphabetical order by first author.

It  is  our  belief  that  the  present  volume  will 
contribute  to  the  rapid  exchange  of  scientific 
information  in  the  field  of  applied  mathematical 
programming  and  modeling. 


Budapest,  December  1,  1992 




István Maros
Chairman 

International  Program  Committee 

APMOD93 


International  Program  Committee 


István Maros - Chairman

(RUTCOR, Rutgers University, New Brunswick,
NJ, USA & Hungarian Academy of Sciences)

Johannes  Bisschop 

(Twente  University,  Enschede,  NL) 

Endre Boros

(RUTCOR,  Rutgers  University,  New  Brunswick, 
NJ,  USA) 

Tibor Csendes

(József Attila University, Szeged, H)

Fred  Glover 

(University  of  Colorado,  Boulder,  USA) 

Manfred Grauer

(University  of  Siegen,  D) 

Harvey  Greenberg 

(University  of  Colorado,  Denver,  USA) 

Peter  Hammer 

(RUTCOR,  Rutgers  University,  New  Brunswick, 
NJ,  USA) 

Freerk  Lootsma 

(University  of  Delft,  NL) 

Gautam  Mitra 

(Brunel University, London, GB)

András Prékopa

(RUTCOR, Rutgers University, New Brunswick,
NJ, USA & Eötvös Loránd University, Budapest,
H)

Dave  Shanno 

(RUTCOR,  Rutgers  University,  New  Brunswick, 
NJ,  USA) 

Uwe Suhl

(Free  University  of  Berlin,  D) 

Stavros  Zenios 

(University  of  Pennsylvania,  Philadelphia, 
USA) 

Margit Ziermann

(Representative  of  IFORS,  H) 


Contents 


Badics,  T.,  Boros,  E. 

Implementing  a  Maximum  Flow  Algorithm:  Experiments  with 
Dynamic Trees  11

Bajalinov,  E.B. 

On  Decomposition  of  Dual  Variables  in  Linear  Programming 
and  its  Economic  Interpretation  16 

Bajalinov, E.B., Pannell, D.J.

GULF:  A  General,  User-friendly  Linear  and  Linear-Fractional 
Programming  Package  21 

Baker,  T.E. 

Graph-Based  Modeling  with  MIMI/G  24 

Barle,  J.,  Grad,  J. 

LPRINT: LP Software Based on the Interior Point Method  27

Bennaceur,  H.,  Plateau,  G. 

Impact  of  Quantitative  Methods  on  Logical  Inference  Problem  31 

Berland, N.J., Wallace, S.W.

Improving  Bounds  in  Stochastic  Programs  by  Partitioning 
the  Support  34 

Bianco,  L.,  Blazewicz,  J.,  Dell'Olmo,  P.,  Drozdowski,  M. 

Scheduling  Multiprocessor  Tasks  on  a  Dynamic  Configuration 
of  Dedicated  Processors  36 

Bierwirth, C., Kopfer, H., Mattfeld, D.C., Utecht, T.

PARNET:  Distributed  Realization  of  Genetic  Algorithms 
in  a  Workstation  Cluster  41 

Biro, M., Bor, A., Knuth, E., Remzso, T., Szillery, A.

Spreadsheet-Based  Model  Building  and  Multiple  Criteria 
Group  Decision  Support  49 

Black,  J.A.,  Seyed-Hosseini,  S.M. 

Traffic  Models  for  the  Economic  Evaluation  of  Private 
Sector  Toll  Roads  and  Tunnels  in  Australia  65 

Boden, H., Grauer, M.

OpTiX-II:  A  Decision  Support  System  for  the  Solution  of 
Nonlinear  Optimization  Problems  on  Parallel  Computers  71 

Boros,  E.,  Recski,  A.,  Wettl,  F. 

Linear  Time  Algorithms  for  VLSI  Routing  78 

Chakrapani,  J.,  Skorin-Kapov,  J. 

A  Constructive  Method  to  Improve  Lower  Bounds  for  the 
Quadratic Assignment Problem  84

Chinneck,  J.W. 

Finding  Minimal  Infeasible  Sets  of  Constraints  in 
Infeasible  Mathematical  Programs  92 

Crama,  Y.,  Oosten,  M. 

The  Polytope  of  Block  Diagonal  Matrices  94 

Csaki,  P.,  Csiszar,  L.,  Folsz,  F.,  Keller,  K.,  Meszaros,  C., 

Rapcsak,  T.,  Turchanyi,  P. 

A  Flexible  Framework  for  Group  Decision  Support:  WINGDSS  V3.0  102 




Csendes, T., Zabinsky, Z.B., Kristinsdottir, B.P.

Global Optimal Solutions with Tolerances and Practical
Composite Laminate Design

Darby-Dowman, K., Rangel do Socorro, M.

Preprocessing  and  Cutting  Planes  for  a  Class  of  Production 
Planning Problems  113

Darby-Dowman, K., Lucas, C., Mitra, G., Smith, J.

Modelling  Strategies  in  Integer  Programming  Applied  to 
Scheduling  a  Fleet  of  Ships  131 

Darby-Dowman, K., Kristjansson, B., Lucas, C., Mitra, G., Moody, S.A.

Representing Procedural Knowledge within Mathematical
Programming Modelling System (MPL)  138

Dell'Amico, M., Martello, S., Vigo, D.

Lower and Upper Bounds for the Single Machine Scheduling
Problem with Earliness and Flow Time Penalties  134

Dorndorf,  U.,  Pesch,  E. 

Combining  Genetic  and  Local  Search  for  Solving  the 
Job  Shop  Scheduling  Problem  142 

Drud, A.S.

Finding an Initial Feasible Point in the Large Scale
GRG Code CONOPT  150

Eschenauer,  H.A.,  Weinert,  M. 

Shape  Optimization  of  Complex  Shell  Structures  in  a 
Parallel  Computing  Environment  158 

Ferreira,  C.E.,  Grotschel,  M.,  Martin,  A.,  Weismantel,  R. 

On  Multidimensional  Partitioning  Problems:  Facial 
Structure  and  Applications  166 

Frandsen,  P.E.,  Sorensen,  S.B. 

Contoured  Beam  Reflector  Array  Antenna  Optimization  with 
Dual  Parameters  Types  170 

Freville, A., Hanafi, S.

A  Lagrangean  Relaxation  for  Optimizing  a  Discrete 
Production  System  in  Manufacturing  178 

Freville, A., Guignard, M.

Relaxations for Minimax Problems

Friedler, F., Fan, L.T.

Combinatorial Acceleration of the Branch and Bound Search
for Process Network Synthesis  192

Fulop,  J. 

On a Special Class of Linear Programs with an Additional
Reverse Convex Constraint  201

Gaivoronski, A.A., Messina, E., Sciomachen, A.

A  Statistical  Generalized  Programming  Algorithm  for 
Stochastic  Optimization  Problems 




Gassmann, H.I.

A  Fast-Start  Algorithm  for  Multistage  Stochastic  Programs  216 

Giannessi,  F. 

Some Connections Between Integer Programming and
Continuous Optimization  220

Glover,  F.,  Babayev,  D. 

New  Results  for  Aggregating  Integer-Valued  Equations  224 

Glover,  F.,  Skorin-Kapov,  J. 

Heuristic  Advances  in  Optimization  Integrating  Tabu  Search, 
Ejection  Chains  and  Neural  Networks  234 

Greenberg,  H.J. 

An Overview of the Development of an Intelligent
Mathematical Programming System  244

Greenberg,  H.J.,  Murphy,  F.H. 

Equivalence  of  Mathematical  Programs  246 

Gurvich,  V.A. 

Extremal Integer Sequences with Forbidden Sums  247

Hajian,  M.,  Levkovitz,  R.,  Mitra,  G. 

A  Branch  and  Bound  Algorithm  for  Discrete  Programming  Using 
the  Interior  Point  Method  and  the  Simplex  Method  257 

Hertog  den,  D.,  Kaliski,  J.,  Roos,  C.,  Terlaky,  T. 

A  Logarithmic  Barrier  Cutting  Plane  Method  for  Convex 
Programming  266 

Hertog  den,  D.,  Jarre,  F.,  Roos,  C.,  Terlaky,  T. 

A  Unifying  Investigation  of  Interior-Point  Methods  for 
Convex  Programming  274 

Hurlimann,  T. 

IP,  MIP  and  Logical  Modeling  Using  LPL  282 

Jansen,  B.,  Roos,  C.,  Terlaky,  T. 

On Vaidya's Volumetric Center Method for Convex Programming  289

Jarre, F.

An  Efficient  Line-search  for  Logarithmic  Barrier  Functions  296 

Jonasson,  K.,  Madsen,  K. 

A  Projected  Conjugate  Gradient  Method  for  Sparse  Minimax 
Problems  304 

Kall, P., Mayer, J.

Model  Management  for  Stochastic  Linear  Programming  312 

Keyser de, W.

Two  Algorithms  for  Solving  Linear  Programs  with  Logical 
Constraints  318 

Kort,  P.M.,  Tapiero,  C.S. 

The  Dynamic  Theory  of  the  Investing  Polluting  Firm  and 
Pollution  Insurance  324 

Kovacs, L.B.

A  Comparative  Study  of  Modelling  and  Problem-Solving  by 
Mathematical  Programming  and  Logic  Programming  329 




Kristjansson,  B.,  Lucas,  C.,  Mitra,  M.,  Moody,  S. 

Modelling  Tools  for  Reformulating  Logical  Forms  into 
Zero-One  Mixed  Integer  Programs  337 

Lasciak,  A. 

Madrid  and  Ramsay  Models  for  Slovak  Economy  345 

Levkovitz, R., Mitra, G., Tamiz, M.

Experimental Investigations in Combining IPM and
Simplex Based LP Solvers  353

Locatelli, M., Schoen, F.

An  Adaptive  Stochastic  Global  Optimization  Algorithm  for 
One-Dimensional  Functions  374 

Lootsma,  F.A. 

Scale  Sensitivity  and  Rank  Preservation  in  a  Multiplicative 
AHP  and  SMART  382 

Madsen, O.

Lagrangean  Relaxation  and  Optimal  Solutions  to  Time  Window 
Constrained  Vehicle  Routing  Problems  389 

Mans,  B.,  Mautor,  T.,  Roucairol,  C. 

Recent  Exact  and  Approximate  Algorithms  for  the  Quadratic 
Assignment  Problem  395 

Maros, I., Meszaros, C.

A  Numerically  Exact  Implementation  of  the  Simplex  Method  403 

Martello,  S.,  Toth,  P. 

The  Bottleneck  Generalized  Assignment  Problem  410 

Melamed, I.I.

Neural Network Algorithms for Combinatorial Optimization
Problems: Theory and Experience  417

Meressoo, T., Vaarmann, O.

Numerical Solution of Certain Decomposition-Coordination
Problems in Nonlinear Programming  422

Minoux, M.

Some Large Scale LP Relaxations for the Graph Partitioning
Problem and Their Optimal Solutions  430

Neck,  R. 

Optimal  Budgetary  Policies  under  Uncertainty:  A  Stochastic 
Control Approach  434

Nygard,  K.E.,  Ficek,  R.K. 

Genetic  Search  in  Multi-Depot  Routing  436 

Padberg,  M.,  Alevras,  D. 

Order  Preserving  Assignments  438 

Pinar,  M.C.,  Zenios,  S.A. 

Nonlinear  Min-max  Optimization  via  Smooth  Penalty  Functions  449 

Pinter,  J.,  Meeuwig  Dirk  J.,  Meeuwig  Jay  H.,  Fels,  M.,  Lycon,  D.S. 

The  ESIS  Project:  An  Intelligent  Decision  Support  System  for 
Assisting  Industrial  Waste  Management 




Ponnambalan,  K.,  Vanelli,  A. 

Forcing  a  Vertex  Optimal  Solution  in  Interior  Point  Method 
Using an Auxiliary Function  463

Prekopa, A., Li, W.

Solution of and Bounding in a Linearly Constrained
Optimization Problem with Convex Polyhedral Objective Function  471

Proll, L.G., Salhi, A., Insua, D.R.

A Parallel Implementation of a Framework for Sensitivity
Analysis in MCDM  475

Proll, L.G., Salhi, A., Insua, D.R., Jimenez, J.I.M.

A Comparison of Two Stochastic Algorithms for Global
Optimisation  483

Queyranne, M., Spieksma, F., Tardella, F.

A General Class of Greedily Solvable Linear Programs  489

Rendl, F.

Eigenvalues and Graph Partitioning Bounds

Rote, G., Vogel, A.

New Heuristics for Decomposing Traffic Matrices in TDMA
Satellite Communication

Rudolph, G.

Parallel Simulated Annealing and its Relation to
Evolutionary Algorithms

Salhi, A., Lindfield, G.R., Proll, L.G.

The Role of Duality in the Efficient Use of a Variant
of Karmarkar's Algorithm

Serin, Y., Avsar, Z.M.

Markov Decision Processes with Restricted Observations: Finite
Horizon Model

Skorin-Kapov, D.

On the Core of the Minimum Cost Steiner Tree Games

Smith, J.M.G., Chikhale, N.

Buffer Allocation for a Class of Nonlinear Stochastic
Knapsack Problems

Sonnevend, G.

Construction and Implementation of a Central Path Following
Algorithm for Semi-infinite Convex Programs in Matlab

Strayer, H.J., Colbourn, C.J.

Bounding Network Reliability Via Surface Duality

Sural, H., Koksalan, M., Kirca, O.

A Location-Distribution Model for a Beer Brewer

508 

516 

521 

529 

537 

541 

549 

559 




T. de Araujo, H.M.

Discrete Dynamic Programming in Environmental Optimisation  562

Tamiz, M., Jones, D.F., El-Darzi, E.

A  Review  of  Goal  Programming  and  its  Applications  573 

Terzi,  A.T.,  Erkip,  N. 

Multi-Stage Economic Lot Scheduling Problem  580

Vial,  J.P. 

Computational  Experience  with  a  Primal-Dual  Interior  Point 
Method  for  Smooth  Convex  Programming  583 

Vizvari,  B.,  Derair,  R. 

It  Is  Difficult  to  Find  a  Difficult  Problem  for  Scheduling  of 
Identical  Parallel  Machines  587 

Voss,  S. 

Concepts  for  Parallel  Tabu  Search  595 

Willers, P., Proll, L., Wren, A.

A  Dual  Strategy  for  Solving  the  Linear  Programming  Relaxation 
of  a  Driver  Scheduling  System  605 

Williams, H.P.

Deriving the Dual of an Integer Programme:
Its Interpretations and Uses  611

Yarrow,  L.-A. 

Obtaining Minimum-Correlation Latin Hypercube
Sampling Plans Using Discrete Optimization Techniques  613

Late  Papers  621 


Implementing  a  Maximum  Flow  Algorithm: 
Experiments with Dynamic Trees
(Extended  Abstract) 


T.  Badics  and  E.  Boros  * 
November  19,  1992 


1  Introduction 

In this paper we report on an implementation of a maximum flow algorithm by
Cheriyan and Hagerup [1]. Our aim was to test the behavior of this algorithm
in practice, in view of its good theoretical worst case bound. We were
particularly interested in the effect of using theoretically well behaved data
structures such as dynamic trees [11] and Fibonacci heaps [13]. We also made
comparisons to two preflow-push based algorithms by Goldberg and Tarjan [10],
and to an implementation of pushes along several edges without using dynamic trees.


2  Basic  notions  and  the  PLED  algorithm 

We assume that the reader is familiar with the generic maximum flow algorithm
in [10] and refer to [10] for definitions of the terms network, source s, sink t, edge
capacity c(v,w), flow, maximum flow, preflow f, flow excess e(v) of a vertex v,
residual graph, residual capacity rescap(v,w) of the edge (v,w), valid labeling d,
active vertex, push, saturating push, and nonsaturating push. Let G = (V,E)
denote the digraph (assumed symmetric) corresponding to the network. Let
N = |V|, M = |E|.

Our implementation is based on the algorithm developed by Cheriyan and
Hagerup (see [1]). Following [1] we shall refer to this algorithm as PLED
(shorthand for Prudent Linking and Excess Diminishing). This algorithm is an
instance (with one minor exception) of the generic preflow algorithm by Goldberg
and Tarjan [10]. PLED also uses an idea introduced by Ahuja and Orlin [4]
of scaling the volume of the pushes. Here, however, the scaling factor plays a
slightly different role: the limits imposed on the volume of a push are not the
same as in [4]. A third idea in PLED is randomization: after each relabeling of
a vertex v, the edge list of v is permuted randomly. This random permutation
ensures a better theoretical running time. Noga Alon showed in [3] that this
randomization can be replaced by a deterministic procedure. The worst case
running time of PLED using the randomized procedure is $O(NM + N^2 (\log N)^2)$
with high probability (see [2]). Using Alon's derandomization, the deterministic
worst case bound improves to $O(NM + N^{8/3} \log N)$. The worst case bound
without randomization is $O(NM \log N)$.

*The second author is supported in part by the Office of Naval Research (Grants
N00014-92-J1375 and N00014-92-J4083).

Three  main  data  structures  are  essential  for  PLED. 

•  An ordinary heap that contains the vertices which have big excesses,
ordered by their distance labels. This structure supports the easy
selection of a vertex for a push (select a vertex with the minimal distance
label among the vertices having large enough excesses).

•  A Fibonacci heap (see [13]) that contains the rest of the vertices, ordered
by their (small) excesses. This supports a constant (amortized) time
decrease-key operation and fast update of the scaling factor.

•  A dynamic trees structure (see [11]) to maintain a spanning forest F
of G containing a subset of the current edges, where the value associated
with an edge in F is its residual capacity. This structure is able to send
flow along a path of length L in (amortized) time $O(\log L)$.


3  Implementation 

Since the algorithm requires the above data structures to achieve the
theoretically best performance, we decided to implement all of them.

Besides the above data structures, we implemented a routine for randomly
permuting the edge lists of the vertices. Although the deterministic
permutation of Alon derandomizes the algorithm, the overhead of such a
permutation generating procedure is so large that we did not expect much
improvement from implementing it. Moreover, since our instances
were mostly randomly generated, after some preliminary experiments we used
the PLED algorithm without random permutations.

For comparison we coded Goldberg's simple preflow-push algorithm,
using a simple FIFO queue for selecting the active vertices, and the dynamic trees
version of this algorithm described in the same paper [10]. Below we refer to
these codes as GOLD and GOLDYN, respectively. In the code GOLDYN we
did not control the size of the dynamic trees. Thus we got a code which has a
theoretical worst case bound of $O(NM \log N)$ instead of $O(NM \log(N^2/M))$. The
reason for this choice was mostly lack of time and the expected overhead of such
a control mechanism. Besides, in the range of examples on which we tested the codes,
the benefit from such size control, even without the overhead, is probably not
significant.
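To make the role of the FIFO queue concrete, the following is a minimal Python sketch of a FIFO preflow-push computation in the spirit of GOLD. It is not the authors' code: the function name, the edge-list input format and the dictionary-based residual graph are assumptions made for illustration, and global relabeling is omitted.

```python
from collections import deque

def fifo_preflow_push(n, edges, s, t):
    """Return the maximum flow value from s to t (illustrative sketch).

    edges is a list of (v, w, capacity) triples on vertices 0..n-1.
    """
    # Residual capacities; a reverse edge is stored for every edge,
    # so the digraph is symmetric as assumed in the text.
    res = [dict() for _ in range(n)]
    for v, w, c in edges:
        res[v][w] = res[v].get(w, 0) + c
        res[w].setdefault(v, 0)

    d = [0] * n          # distance labels
    d[s] = n
    excess = [0] * n
    queue = deque()      # FIFO queue of active vertices
    queued = [False] * n

    def push(v, w, delta):
        res[v][w] -= delta
        res[w][v] += delta
        excess[v] -= delta
        excess[w] += delta
        if w not in (s, t) and not queued[w]:
            queued[w] = True
            queue.append(w)

    # Initial preflow: saturate every edge leaving the source.
    for w, c in list(res[s].items()):
        if c > 0:
            push(s, w, c)

    while queue:
        v = queue.popleft()
        queued[v] = False
        # Push along admissible edges (v, w), i.e. d[v] == d[w] + 1.
        for w, c in res[v].items():
            if excess[v] == 0:
                break
            if c > 0 and d[v] == d[w] + 1:
                push(v, w, min(excess[v], c))
        if excess[v] > 0:
            # Relabel v and put it back at the end of the queue.
            d[v] = 1 + min(d[w] for w, c in res[v].items() if c > 0)
            queued[v] = True
            queue.append(v)
    return excess[t]
```

On the five-edge network `[(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]` with source 0 and sink 3 the function returns 5, which equals the capacity of the cut around the source.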

To test the performance of the dynamic trees data structure, we also
implemented its operations (find-min, add-value, find-root, link, cut, etc.) by
storing the trees explicitly and executing these operations in the obvious (linear
time) way. In this way we avoided the overhead of handling splay trees and
complicated updates. Later in this paper the codes using these "non-dynamic tree"
operations are called NPLED and NGOLDYN. Note that the number of
elementary operations for PLED and NPLED (or GOLDYN and NGOLDYN) is
the same on the same instance; only the way of handling the tree operations is
different. Therefore the difference in running time shows exactly the impact of
the dynamic trees structure.
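The "obvious (linear time)" tree operations can be pictured with a short Python sketch that stores each tree through explicit parent pointers. The class name `NaiveForest`, the method signatures and the tie-breaking rule in `find_min` are assumptions for illustration only; the abstract does not describe the actual data layout of NPLED or NGOLDYN.

```python
class NaiveForest:
    """Rooted forest with a value on each tree edge, stored explicitly."""

    def __init__(self, n):
        self.parent = [None] * n   # parent[v] is None when v is a root
        self.value = [None] * n    # value of the edge (v, parent[v])

    def find_root(self, v):
        while self.parent[v] is not None:
            v = self.parent[v]
        return v

    def find_min(self, v):
        """Return (vertex, value) of a minimum-value edge on the path
        from v to its root, or None if v is a root."""
        best = None
        while self.parent[v] is not None:
            if best is None or self.value[v] < best[1]:
                best = (v, self.value[v])
            v = self.parent[v]
        return best

    def add_value(self, v, x):
        """Add x to every edge value on the path from v to its root."""
        while self.parent[v] is not None:
            self.value[v] += x
            v = self.parent[v]

    def link(self, v, w, x):
        """Make the root v a child of w with edge value x."""
        assert self.parent[v] is None and self.find_root(w) != v
        self.parent[v] = w
        self.value[v] = x

    def cut(self, v):
        """Remove the edge (v, parent[v]) and return its value."""
        x = self.value[v]
        self.parent[v] = None
        self.value[v] = None
        return x
```

Sending flow along a tree path then amounts to `find_min` followed by `add_value` with the negated minimum, each taking time proportional to the path length rather than its logarithm.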

In our implementations of all the codes we employed an idea mentioned in
[10], the so-called "global" or "big" relabeling. Our early experiments showed
clearly that in PLED, just as in GOLD or GOLDYN, the running times of the
variant which uses global relabeling were much smaller (by orders of magnitude)
than those of the variant which does not use it. Therefore we built in some
heuristic parameters which control the calling frequency of big-relabeling and
thus affect the running time.

A "big-relabeling" step consists of two breadth-first searches, one starting
from the sink and working on the sink side of the residual graph, and another
one for the source side, starting from the source. In these breadth-first searches
the shortest distance is calculated from each vertex to the sink on the sink side,
or to the source on the source side, respectively. Unfortunately breadth-first
search is a relatively expensive operation (it takes $O(M)$ steps), so the calling
frequency of big-relabel is very important and can be a subject of later studies.
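The two searches can be sketched in Python as follows. This is a hedged illustration rather than the authors' routine: the nested-dictionary residual graph `rescap`, the function names, and the convention that source-side vertices receive the label N plus their distance to the source are assumptions (the last one follows the usual convention for preflow-push labelings).

```python
from collections import deque

def _bfs(rescap, d, root, skip=None):
    """Label every still unlabeled vertex that can reach root."""
    queue = deque([root])
    while queue:
        w = queue.popleft()
        for v in rescap[w]:
            # v is one step farther than w if the residual edge (v, w)
            # has positive capacity.
            if v != skip and v not in d and rescap[v][w] > 0:
                d[v] = d[w] + 1
                queue.append(v)

def global_relabel(rescap, s, t):
    """Return new distance labels as a dict.

    rescap[v][w] is the residual capacity of edge (v, w); the digraph is
    assumed symmetric, so w in rescap[v] implies v in rescap[w].
    """
    n = len(rescap)
    d = {t: 0}
    _bfs(rescap, d, t, skip=s)   # sink side: distance to the sink
    d[s] = n
    _bfs(rescap, d, s)           # source side: n + distance to the source
    return d
```

Vertices that can reach neither the sink nor the source in the residual graph stay unlabeled in this sketch; such vertices carry no excess.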

We implemented another mechanism to achieve better running times in all
three codes. Namely, at initialization the algorithm calculates an upper bound
U on the maximum flow value by taking the minimum of the capacities of some
cuts. Then it creates a new source by adding an artificial vertex S and a new
arc (S, s) with capacity U to the network, where s was the old source. The
new problem is obviously equivalent to the old one, and the extra cost of its
implementation is negligible. The advantage of doing this is that we do not let
the algorithm push too much excess into the network, reducing in this way the
runtime of the second phase. We have found instances showing that without
this procedure the running time was significantly bigger due to the long second
phase.
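A minimal Python sketch of this preprocessing step is given below. The abstract does not say which cuts are inspected, so the sketch assumes the two trivial cuts (all arcs leaving the old source, all arcs entering the sink); the function name and the dictionary representation of capacities are likewise hypothetical.

```python
def add_artificial_source(cap, s, t, new_source="S"):
    """Return (bounded network, new source, U).

    cap maps arcs (v, w) to capacities. U is the smaller of the two
    trivial cut capacities (assumed choice of cuts).
    """
    out_of_s = sum(c for (v, w), c in cap.items() if v == s)
    into_t = sum(c for (v, w), c in cap.items() if w == t)
    U = min(out_of_s, into_t)
    bounded = dict(cap)
    bounded[(new_source, s)] = U   # new arc (S, s) with capacity U
    return bounded, new_source, U
```

With the new source, the initial saturating push places at most U units of excess at the old source, so the first phase cannot flood the network with flow that must later be returned.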
4  Experimental results

For the experiments, we used the problems suggested by DIMACS, and the
generators GENRMF, WASHINGTON, and AC-MAX [6,5,7]. (See the DIMACS
document "The Core Experiments".) The families of networks we report on
include the ones suggested by "The Benchmark Experiments", and two classes
of problems made intentionally very difficult for Goldberg's preflow-push
algorithm.

All the experiments were carried out on a Sun SPARC 1+ workstation under
the UNIX operating system.

5  Conclusions 

Summarizing  our  work,  we  can  conclude  that  although  the  PLED  algorithm 
has  a  very  good  theoretical  worst  case  bound,  in  practice  Goldberg’s  simple 
preflow-push  algorithm  outperforms  it  on  most  of  the  examples  of  this  study. 

Our study shows that the structure of the networks is the most important
factor in ranking the algorithms. One parameter to be considered, reflecting
the structure of the network, could be the relative distance between the
source and the sink.

In this study we were particularly interested in the effectiveness of dynamic
trees. Our experiments show clearly that there are families of problems for
which dynamic trees improved the performance of our code at a small cost. To
determine the properties of network classes on which the algorithms GOLDYN
or PLED are the best would be an interesting topic for later work.

Let us remark finally that Fibonacci heaps did not help much in these
examples. They improved neither the running time nor the number of selection
steps of PLED.
Consider the following linear programming (LP) and linear fractional programming
(LFP) problems

$$\max_{x \in S} P(x), \tag{1}$$

$$\max_{x \in S} D(x), \tag{2}$$

$$\max_{x \in S} Q(x), \tag{3}$$

where $Q(x) = P(x)/D(x)$, $P(x) = \sum_{j=1}^{n} p_j x_j + p_0$, $D(x) = \sum_{j=1}^{n} d_j x_j + d_0 > 0$
for all $x \in S = \{x \in R^n : Ax \le b,\ x \ge 0\}$, $A$ is an $m \times n$ matrix, i.e. $A = \|a_{ij}\|_{m \times n}$,
$x = (x_1, x_2, \ldots, x_n)^T$, $b = (b_1, b_2, \ldots, b_m)^T$, $a_{ij}$, $b_i$, $p_j$, $d_j$ are scalar constants and
$T$ denotes the transpose of a vector. Assume that the feasible set $S$ is non-empty
and $P(x)$, $D(x)$ and $Q(x)$ are not constant on $S$.


Let the basic feasible solution which maximizes the objective function $D(x)$ be the
vector $x^* = (x_1^*, x_2^*, \ldots, x_m^*, 0, 0, \ldots, 0)^T$. Our aim now is to show that for any
optimal solution of problem (2) we can find a vector $p = (p_0, p_1, \ldots, p_n)$ such that
$x^*$ is an optimal solution of LP problem (1) and LFP problem (3). Further, let
$B = (A_1, A_2, \ldots, A_m)$ be the optimal basis associated with the positive variables,
where $A_j = (a_{1j}, a_{2j}, \ldots, a_{mj})^T$ is the $j$-th column vector of matrix $A$. Because the basis
vectors are linearly independent we have

$$A_j = \sum_{i=1}^{m} A_i x_{ij}, \quad j = 1, 2, \ldots, n,$$

and we use these coefficients $x_{ij}$ to define the following

$$\Delta'_j = \sum_{i=1}^{m} p_i x_{ij} - p_j, \qquad \Delta''_j = \sum_{i=1}^{m} d_i x_{ij} - d_j, \qquad \Delta_j(x^*) = D(x^*)\Delta'_j - P(x^*)\Delta''_j, \qquad j = 1, 2, \ldots, n. \tag{4}$$
Further, the values $\Delta_j(x^*)$ can also be put in the form

$$\Delta_j(x^*) = \sum_{i=1}^{m} p_i R_{ij} - p_j D(x^*) - p_0 \Delta''_j, \quad j = 1, 2, \ldots, n,$$

where $R_{ij} = D(x^*) x_{ij} - \Delta''_j x_i^*$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$.
Because the vector $x^*$ is an optimal solution of (2) we have [1]

$$\Delta''_j \begin{cases} = 0, & j = 1, 2, \ldots, m, \\ \ge 0, & j = m+1, m+2, \ldots, n. \end{cases} \tag{5}$$


As in [1] and [2], the basis of LP problem (1) and LFP problem (3) is optimal in
original form if $\Delta'_j \ge 0$ for all $j$ and $\Delta_j(x^*) \ge 0$ for all $j$, respectively, but we need
only consider $j = m+1, m+2, \ldots, n$ because

$$\Delta'_j = \Delta_j(x^*) = 0, \quad j = 1, 2, \ldots, m.$$


The correctness of the following assertion is obvious.


Theorem 1 If the vector $p = (p_0, p_1, \ldots, p_n)$ satisfies the conditions

$$\sum_{i=1}^{m} p_i x_{ij} - p_j \ge 0, \qquad \sum_{i=1}^{m} p_i R_{ij} - p_j D(x^*) - p_0 \Delta''_j \ge 0, \qquad j = m+1, m+2, \ldots, n, \tag{6}$$

then $x^*$ is an optimal solution of LP problem (1) and LFP problem (3).


We denote the set of vectors $p$ which satisfy the inequalities (6) by $H$. It is obvious
that $H \ne \emptyset$. Indeed, if $p = \lambda d$ where $\lambda > 0$ and $d = (d_0, d_1, \ldots, d_n)$, then using (4)
and (5) we get

$$\Delta'_j = \lambda \Delta''_j \ge 0, \qquad \Delta_j(x^*) = D(x^*)(\lambda \Delta''_j) - (\lambda D(x^*))\Delta''_j = 0, \qquad j = m+1, m+2, \ldots, n.$$

This means that the set $H$ contains at least the ray $\lambda d$, $\lambda > 0$.
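The membership test for the set H can be illustrated numerically. The Python sketch below evaluates the three reduced-cost quantities for a given profit vector and checks the sign conditions on the non-basic columns; the function names and the tiny instance (one constraint $x_1 + x_2 \le 4$, $d = (1, 2, 1)$, basic variable $x_1$) are assumptions chosen only for illustration.

```python
def reduced_costs(p, d, x_coef, x_star):
    """Return a list of (delta1_j, delta2_j, delta_j) for j = 1..n.

    p, d    : coefficient lists [p0, p1, ..., pn] and [d0, d1, ..., dn]
    x_coef  : x_coef[i][j] is the coefficient x_ij of column j in the basis
    x_star  : values of the m basic variables (the first m variables)
    """
    m, n = len(x_star), len(p) - 1
    P = p[0] + sum(p[i + 1] * x_star[i] for i in range(m))
    D = d[0] + sum(d[i + 1] * x_star[i] for i in range(m))
    rows = []
    for j in range(n):
        d1 = sum(p[i + 1] * x_coef[i][j] for i in range(m)) - p[j + 1]
        d2 = sum(d[i + 1] * x_coef[i][j] for i in range(m)) - d[j + 1]
        rows.append((d1, d2, D * d1 - P * d2))
    return rows

def in_H(p, d, x_coef, x_star):
    """Check the two sign conditions on the non-basic columns."""
    m = len(x_star)
    rows = reduced_costs(p, d, x_coef, x_star)
    return all(d1 >= 0 and dj >= 0 for d1, _, dj in rows[m:])

# Hypothetical instance: A = (1 1), b = 4, basis column A_1, x* = (4, 0).
d = [1, 2, 1]
x_coef = [[1, 1]]
x_star = [4]
print(in_H([3, 6, 3], d, x_coef, x_star))   # p = 3d lies on the ray
print(in_H([0, 1, 5], d, x_coef, x_star))   # profit of product 2 too high
```

For $p = 3d$ the second reduced cost is exactly zero, as the argument for the ray predicts, while the vector $(0, 1, 5)$ violates the first condition.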
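Let us now consider the dual problems corresponding to primal problems (1), (2)
and (3) respectively, [1] and [3],

$$\begin{aligned} \text{Minimize} \quad & \varphi(u) = \sum_{i=1}^{m} b_i u_i + p_0 \\ \text{subject to} \quad & \sum_{i=1}^{m} a_{ij} u_i \ge p_j, \quad j = 1, 2, \ldots, n, \\ & u_i \ge 0, \quad i = 1, 2, \ldots, m, \end{aligned} \tag{7}$$

$$\begin{aligned} \text{Minimize} \quad & \phi(v) = \sum_{i=1}^{m} b_i v_i + d_0 \\ \text{subject to} \quad & \sum_{i=1}^{m} a_{ij} v_i \ge d_j, \quad j = 1, 2, \ldots, n, \\ & v_i \ge 0, \quad i = 1, 2, \ldots, m, \end{aligned} \tag{8}$$

$$\begin{aligned} \text{Minimize} \quad & \psi(y) = y_0 \\ \text{subject to} \quad & d_0 y_0 - \sum_{i=1}^{m} b_i y_i \ge p_0, \\ & d_j y_0 + \sum_{i=1}^{m} a_{ij} y_i \ge p_j, \quad j = 1, 2, \ldots, n, \\ & y_i \ge 0, \quad i = 1, 2, \ldots, m. \end{aligned} \tag{9}$$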

The  next  theorem  indicates  an  important  relationship  between  the  optimal  solutions 
of  these  problems. 

Theorem 2 If LP problems (1), (2) and LFP problem (3) have at least one common
non-degenerate optimal solution $x^*$, then the following decomposition takes place

$$u_i^* = y_i^* + Q(x^*) v_i^*, \quad i = 1, 2, \ldots, m, \tag{10}$$

where $u^* = (u_1^*, u_2^*, \ldots, u_m^*)$, $v^* = (v_1^*, v_2^*, \ldots, v_m^*)$, $y^* = (y_0^*, y_1^*, \ldots, y_m^*)$ are optimal
solutions of dual problems (7), (8) and (9) respectively.

Proof. Suppose that the vector $x^*$ is a common non-degenerate optimal solution of (1),
(2) and (3). Let us replace the $k$-th element $b_k$ of vector $b$ by $b_k + \varepsilon$. Here and
in what follows this replacement is assumed to cause no change in the basis of the
optimal solution. In accordance with LP theory [1], for the new optimal solution
$x' = (x'_1, x'_2, \ldots, x'_m, 0, 0, \ldots, 0)^T$ we have

$$P(x') = P(x^*) + \varepsilon u_k^*, \tag{11}$$

$$D(x') = D(x^*) + \varepsilon v_k^*. \tag{12}$$
Analogously, in accordance with [4] for LFP problem (3) we get

$$Q(x') = Q(x^*) + \frac{\varepsilon y_k^*}{D(x')}.$$

This equation can be written as

$$P(x') = Q(x^*) D(x') + \varepsilon y_k^*.$$

A comparison of the latter with (11) makes us infer that

$$P(x^*) + \varepsilon u_k^* = Q(x^*) D(x') + \varepsilon y_k^*.$$

Making use of equation (12) in the latter, and of $P(x^*) = Q(x^*) D(x^*)$, we find

$$\varepsilon u_k^* = \varepsilon y_k^* + Q(x^*) \varepsilon v_k^*.$$

It means that the decomposition (10) is correct.
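The perturbation argument of the proof can be replayed on a very small instance with exact rational arithmetic. The Python sketch below is only an illustration under stated assumptions: the one-variable instance ($P(x) = 3x + 2$, $D(x) = x + 1$, $S = \{x : 0 \le x \le b\}$, $b = 4$) and all names are hypothetical, and the dual values are recovered from the perturbation relations for $P(x')$, $D(x')$ and $P(x') = Q(x^*)D(x') + \varepsilon y_k^*$ rather than by solving the dual problems.

```python
from fractions import Fraction

# Hypothetical one-variable instance; all three objectives increase in x,
# so x* = b is the common non-degenerate optimal solution.
p0, p1 = 2, 3
d0, d1 = 1, 1

def P(x):
    return p1 * x + p0

def D(x):
    return d1 * x + d0

def Q(x):
    return Fraction(P(x), 1) / D(x)

b = Fraction(4)
eps = Fraction(1, 2)
x_star, x_new = b, b + eps            # the basis does not change

u = (P(x_new) - P(x_star)) / eps      # from P(x') = P(x*) + eps * u
v = (D(x_new) - D(x_star)) / eps      # from D(x') = D(x*) + eps * v
y = (P(x_new) - Q(x_star) * D(x_new)) / eps   # from P(x') = Q(x*)D(x') + eps*y

print(u, v, y, Q(x_star))             # prints: 3 1 1/5 14/5
print(u == y + Q(x_star) * v)         # prints: True
```

The same numbers are obtained from the dual problems of this instance: $u^* = 3$, $v^* = 1$, and the LFP dual is minimized at $y_0^* = 14/5$, $y_1^* = 1/5$.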

Let us now focus on the economic interpretation of the results described above.
Let a certain company manufacture $n$ different kinds of a certain scarce product.
Further, let $p_j$ be the profit gained by the company from a unit of the $j$-th kind of the
product, $p_0$ be some constant profit whose magnitude is independent of the
output volume, $b_i$ be the volume of some resource $i$ available to the company and $a_{ij}$
be the expenditure quota of the $i$-th resource for manufacturing a unit of the $j$-th kind of
the product. Denote the unknown output volume of the $j$-th kind of the product
by $x_j$. If $D(x)$ is the total output of the product, then problem (2) corresponds to
the economic interests of the consumers. If the company's aim is maximization of
its profit $P(x)$ and/or production efficiency $Q(x)$, calculated as the profit gained from
a unit of output, then problems (1) and (3) correspond to the company's economic
interests. Suppose that the vector $x^*$ maximizes the output function $D(x)$ on the feasible
set $S$, i.e. $x^*$ is the best output plan from the customers' point of view. If the profit
vector $p$ satisfies the conditions (6), then the vector $x^*$ maximizes the company's profit
$P(x)$ as well as the production efficiency $Q(x)$. It means that to maximize its profit
and/or production efficiency the company ought to organize its manufacturing in
accordance with an output plan $x^*$ which conforms to the economic interests of the
consumers in the best way. In this case we will say that the economic interests of
the company conform to the economic interests of the consumers.

Further, let the economic interests of the company conform to those of the consumers
and $x^*$ be the optimal solution of (1), (2) and (3). In accordance with Theorem 2, in
this case decomposition (10) takes place. It is obvious that (10) may be interpreted
in the following way: if the volume of resource $i$ increases by one unit, the profit of
the company rises by $u_i^*$ units. Of these, $y_i^*$ units are created by more
intensive production, whereas $Q(x^*) v_i^*$ units are created by more extensive production, where
$v_i^*$ is the output increase.

This decomposition may prove to be useful if scarce resources are distributed among
producers in a centralized way. Indeed, let us suppose that the company has made a
request to be allocated certain extra units of the $i$-th resource. From the customers'
point of view it would be reasonable to satisfy the request if and only if $v_i^* > 0$,
because this is exactly the case in which the use of an additional volume of the $i$-th resource
brings about an extra output of the scarce product.

Another way of using (10) is to use $Q(x^*) v_i^*$ as an extra charge for an extra unit of the
i-th resource. Indeed, in this case, if the use of an extra unit of the i-th resource
does not lead to an increase in efficiency, i.e. $y_i^* = 0$, then the extra profit of the
company is equal to zero, too. It means that these extra charges will create an
interest in increasing the use primarily of the resource whose index $i_0$ is defined by
the equation

$$ i_0 = \arg\max_{1 \le i \le m} y_i^* $$

since in this case the extra profit is the largest. So if these extra charges are
introduced into practice, they will favour the intensification of production.

GULF is a simple-to-use but powerful menu-driven linear-fractional programming
(LFP) and linear programming (LP) package for IBM-compatible MS-DOS microcomputers
with a minimum of 256K RAM and one floppy disk drive.

The  LFP  problem  solvable  by  the  program  may  be  written  as  follows 


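$$ Q(x) = \frac{P(x)}{D(x)} = \frac{\sum_{j=1}^{200} p_j x_j + p_0}{\sum_{j=1}^{200} d_j x_j + d_0} \to \max\ (\min), \qquad (1) $$

subject to

$$ \sum_{j=1}^{200} a_{ij} x_j \le (\ge)(=)\ b_i, \quad i = 1, 2, \ldots, 150, \qquad (2) $$

$$ x_j \ge 0, \quad j = 1, 2, \ldots, 200, \qquad (3) $$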


where the denominator $D(x) \ne 0$ for all $x \in S$, and $S$ is the feasible set defined by
the constraints (2) and (3).

GULF is centered around a spreadsheet-style editor which is used to enter a new problem
or edit an existing one. It operates similarly to an electronic spreadsheet program
such as Lotus 1-2-3, Quattro Pro or Excel. The commands are available through
the slash (/) key. Among them, the Calculate command computes the optimal solution.
After calculating, GULF prints the optimal solution, retrieves the data file and
returns to the editor.

The program is started by typing GULF [Return]. A spreadsheet representation
immediately appears, in the following format:


Gulf          Limit   Col 1   Col 2   Col 3   Col 4   Col 5   Col 6
Obj.Numer  N
Obj.Denom  N   1.00
Row 1      L
Row 2      L
Row 3      L
Row 4      L
Row 5      L
Row 6      L
Row 7      L
Row 8      L
Row 9      L
Row 10     L
Row 11     L
Row 12     L
Row 13     L
Row 14     L
Row 15     L
Row 16     L
Row 17     L
Row 18     L
Row 19     L
GULF v2.2
Row=-1  Col=0  Aim=MAX  File=C:\DEFAULT.GLF
Type < / > for commands


The upper-left position is reserved for the problem name, which you may modify
at will, as you may any other spreadsheet position. The number 1.00 in the "Limit"
column of the "Obj.Denom" row and 0.00 in the other columns of the row (zeros
are blanked) are the default values of the objective function denominator's constant
term and coefficients respectively. If you retain these default values, GULF solves a
standard LP problem using the objective function coefficients in the "Obj.Numer"
row. To solve an LFP problem, the "Limit" value of the "Obj.Denom" row must
be changed to a value other than 1.00 and/or other coefficients of the row must be
changed to values other than zero.
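
The role of the default denominator row can be illustrated with a minimal sketch. It is not GULF code: the function names `lfp_objective` and `is_plain_lp` and the sample data are assumptions made here for illustration. With the defaults (all denominator coefficients 0, constant term 1) the ratio Q(x) = P(x)/D(x) equals the linear numerator P(x), which is why the problem is then an ordinary LP.

```python
from fractions import Fraction

def lfp_objective(x, p, p0, d, d0):
    """Evaluate Q(x) = P(x) / D(x) with P(x) = p.x + p0 and D(x) = d.x + d0."""
    numer = sum(pj * xj for pj, xj in zip(p, x)) + p0
    denom = sum(dj * xj for dj, xj in zip(d, x)) + d0
    if denom == 0:
        raise ZeroDivisionError("D(x) must be nonzero on the feasible set")
    return Fraction(numer) / Fraction(denom)

def is_plain_lp(d, d0):
    """True when the denominator row still holds the defaults (d = 0, d0 = 1)."""
    return d0 == 1 and all(dj == 0 for dj in d)

# Hypothetical data: two variables, numerator 3*x1 + 1*x2.
x = [2, 3]
p, p0 = [3, 1], 0
print(lfp_objective(x, p, p0, [0, 0], 1))  # default denominator: Q(x) = P(x)
print(lfp_objective(x, p, p0, [1, 1], 1))  # genuinely fractional objective
```

Exact rational arithmetic is used only to keep the illustration free of rounding; nothing in the abstract says how GULF stores its numbers.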

The spreadsheet-style data editor includes a full range of editing functions, is menu
driven, has a help facility and gives informative error messages. The first menu level
offers the Calculate command (which solves the LFP or LP problem), the Help command
(which leads the user to 4 help screens), the Alter command for modifying the default
parameters, the File command for file operations, the Print command, the Quit command, and
a [Tab] command (which can also be reached directly from the editor) which leads
to matrix operations (Delete row or column, Insert, Copy, etc.).

To solve an LP or LFP problem GULF uses the well-known simplex algorithm [1], [2]. The
user has the choice between two pivot selection methods: simple steepest ascent and
highest step [3]. The second method involves longer iterations
but may result in fewer steps. The package includes an ideal feature for those who are
learning about the simplex method: the facility to drop back into the editor to view
the matrix after each iteration.

The optimal solution and/or problem matrix can be printed on the screen, on a
printer  or  into  a  text  file  on  disk.  Output  includes  levels,  slacks,  shadow  costs  and 
prices  and  range  analysis,  each  of  which  can  optionally  be  suppressed.  After  printing 
output,  GULF  returns  to  the  editor  and  you  can  continue  making  any  changes  to 
the  matrix. 

The standard MPS data format is used, so data can be exchanged with other LP packages
on  mainframe  or  micro.  It  is  possible  to  write  your  own  data  entry  program  which 
interfaces  directly  with  GULF’s  solving  algorithm,  bypassing  the  data  editor. 

With 256K of RAM, the maximum problem size is 120 nonnegative variables and
80 constraints. This maximum size increases to 200 nonnegative variables and 150
constraints if you have 384K or more RAM. There is no minimum disk size required
as GULF takes up only about 100K of disk space. GULF is not copy protected.

We describe the results of a two-year research and development project in the area of
graph-based  modeling.  The  project  was  funded  by  a  consortium  of  six  industrial  sponsors  and 
was  carried  out  by  Chesapeake  Decision  Sciences,  a  small  U.S.  company  that  specializes  in  the 
development  of  state-of-the-art  software  in  the  area  of  planning  and  scheduling. 

The  graph-based  modeling  developments  served  as  an  extension  of  the  existing  MIMI  (Manager 
for  Interactive  Modeling  Interfaces)  system  which  provides  operations  research,  expert  system, 
interactive  graphics,  and  database  capabilities  for  the  solution  of  complex  industrial  planning 
and  scheduling  problems.  Thus,  the  graph-based  modeling  features  (MIMI/G)  are  intended  to 
provide  support  for  all  aspects  of  the  MIMI  system  including  mathematical  programming  model 
management and solution analysis. As opposed to programming specialized windowed interfaces,
we set out to provide a generic graph-based modeling language to enable end users to
create new interfaces in a few seconds.

Graph-based  modeling  in  MIMI/G  uses  a  node/edge  paradigm  in  which  graph  attributes  are 
associated  directly  with  structures  and  data  in  the  MIMI  database.  The  MIMI  database  consists 
of sets (ordered lists) and tables (defined on sets) and supports both hierarchical and relational
data models. Graph nodes are generally associated with objects or entries in MIMI sets. Graph
edges  are  usually  associated  with  relations  or  entries  in  MIMI  tables.  Graph  attributes  (e.g., 
node  position,  node  size,  node  color,  edge  width,  edge  color,  etc.)  are  associated  with  values  in 
MIMI  tables. 


[Figure: Graph Attributes]

Relations defined on special MIMI sets (called graph sets) are used to specify the mapping of
database elements into graph attributes. The GRAPH command, operating on graph sets,
generates the graph in the X Window environment. Once generated, graphs maintain a
one-to-one relationship with the underlying data; a change to the graph changes the data and
vice versa. Adding or deleting nodes in a graph adds or deletes the set entries in the MIMI
database. Adding or deleting edges adds or deletes entries in MIMI tables. Thus, the graphs
themselves become a natural user interface.

Each  node  and  edge  in  a  graph  has  a  domain  defined  by  the  sets  and  set  entries  associated  with 
the  node  or  edge.  If  the  nodes  in  a  graph  were  generated  as  the  entries  in  a  particular  set,  then 
the domain of each node can be as simple as a tuple listing the defining set and the entry
associated with each node. However, the nodes in many graphs are defined on complex domains
represented by several tuples of defining sets and set entries.

In  general,  graphs  are  generated  for  a  large  variety  of  domains  but  displayed  selectively  only  for 
a  few  domains.  The  manipulation  of  the  domains  that  define  which  portions  of  the  graph  to 
display  and  which  to  hide  is  called  graph  navigation.  Since  the  graphs  in  industrial  applications 
are  generally  quite  large  (with  hundreds  or  even  thousands  of  nodes  and  edges),  efficient  graph 
navigation  is  key  to  the  success  of  MIMI’s  graph-based  modeling  development  efforts. 

For  example,  we  might  choose  to  generate  a  graph  of  a  large  portion  of  an  LP  matrix  with  nodes 
defined  as  matrix  columns  from  set  MAC  and  matrix  rows  from  set  MAR.  Edges  would 
represent  nonzero  matrix  elements  from  the  sparse  table  MATX(MAR,MAC).  Each  matrix 
column also has a domain associated with the meaning of the column in physical terms: blending
activity, time period 1, product PA, location BR, etc. These domains are also associated with
the  node  representing  the  matrix  column  so  that  we  can  navigate  the  matrix  graph  by  specifying 
a  filter  of  partial  domains  for  display. 
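
The LP matrix example can be sketched outside MIMI in a few lines. The sketch below is not MIMI/G syntax: the dictionary layout, the function names `matrix_graph` and `navigate`, and the row and column labels are all assumptions made for illustration. It builds edges from the nonzero entries of a sparse table and then keeps only the edges whose column domain matches a partial-domain filter.

```python
def matrix_graph(matx):
    """Return (rows, cols, edges) for a sparse table {(row, col): value}.

    Only nonzero entries become edges; row and column nodes are those
    that appear in at least one edge.
    """
    edges = sorted((r, c) for (r, c), v in matx.items() if v != 0)
    rows = sorted({r for r, _ in edges})
    cols = sorted({c for _, c in edges})
    return rows, cols, edges

def navigate(edges, col_domains, partial):
    """Keep edges whose column domain agrees with every key of `partial`."""
    def matches(col):
        dom = col_domains.get(col, {})
        return all(dom.get(k) == v for k, v in partial.items())
    return [(r, c) for r, c in edges if matches(c)]

# Hypothetical stand-in for MATX(MAR, MAC) and the column domains.
MATX = {("R1", "C1"): 1.0, ("R1", "C2"): -1.0, ("R2", "C2"): 2.0, ("R2", "C3"): 0.0}
DOMAINS = {
    "C1": {"activity": "blending", "period": 1, "product": "PA", "location": "BR"},
    "C2": {"activity": "blending", "period": 2, "product": "PA", "location": "BR"},
}
rows, cols, edges = matrix_graph(MATX)
print(navigate(edges, DOMAINS, {"period": 1}))
```

Filtering on `{"product": "PA"}` instead would display all three edges, since both columns share that part of their domain.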
Graph nodes are treated as objects in the object-oriented programming sense. When the user
selects a node with the left or right mouse button, the graph tells the MIMI database the domain
(of the node) the user has selected. MIMI macros or rules can be linked to nodes so that they
will be run or fired upon mouse selection. Nodes can also be associated with additional data
structures called frames, which pop up editable windows focused on the current domain of
interest when selected with the right mouse button.

The MIMI database supports inheritance and so does the frame feature associated with the
right mouse button. Any text selected by the right mouse button is referenced through MIMI's
database structures to present a window with the correct data (perhaps inherited) focused on
the active domain of interest.

Node and edge shapes can be selected from a list of standard shapes or from external pixmaps
supplied by the user. Thus, graphs can also be used to create icon-style interfaces for mouse
selection. Pixmaps can also be imported as backgrounds (e.g. maps, plant layouts) for
superimposing graph structures related to data.

Quite  often,  the  x,y  positions  of  graph  nodes  are  related  to  data  in  the  MIMI  database;  however, 
in  some  cases  we  would  like  the  x,y  positions  of  the  nodes  to  be  controlled  by  edge  relationships. 
For these graphs, MIMI/G contains six layout routines which can be specified as part of the
graph  set  definition. 

Node/edge  relationships  in  graphs  often  reveal  structure  at  a  glance.  However,  they  also 
provide  a  natural  interface  for  delving  into  the  MIMI  database  along  the  lines  indicated  by  the 
mouse  selection  of  the  user  in  an  intuitive  form. 

The visualization of relations and data structures provides insight in many novel forms.
Obviously, many physical problems have natural graphical interpretations, and we would expect
that it would be easy to represent the data structures describing these problems in a graph-based
modeling form. In all cases in which we found the specification of a graph set difficult for a
natural problem, upon examination we found the data structure representing the problem
to be inefficient or unnatural, an observation which had escaped us prior to graphical
visualization.

The graph-based modeling language in MIMI/G is quite simple to learn and to use. Many graphs
can be specified with a few lines that relate graph attributes to database structures. MIMI/G
facilitates the modeling process by allowing classes of graphs to be defined and by allowing other
graphs to inherit their properties. Thus, neophyte users are able to specify new graphs on the
basis of a collection of example graphs without completely understanding the process.
Nevertheless, the ability to develop novel applications of graph-based modeling seems to be
limited to a few, probably the same small subset of people who are good at modeling in general.

Since  the  MIMI/G  development  is  new,  our  observation  of  users  is  limited.  However,  initial 
results  indicate  that  there  is  a  high  degree  of  acceptance  of  the  graph-based  interfaces  among 
model developers and end users alike.

Karmarkar's algorithm (Karmarkar 1984) and other interior
point methods are now regarded as competitive methods for
solving linear programming (LP) problems. It is therefore
worthwhile to undertake the development of professional LP
software based on a particular interior point method. We
describe design and implementation aspects of LPINT, an LP
software package based on the primal-dual interior
point algorithm. Its main characteristics can be stated as
follows:

-  high  performance  is  assured  by  using  state  of  the  art 
algorithms  (Lustig  et  al.  1991,  Mehrotra  1991,  Altman, 
Gondzio  1992)  and  recent  results  in  sparse  matrix  research 
(George,  Liu  1981,  Duff  et  al.  1989). 

-  the  overall  system  design  is  influenced  by  proven  systems 
which  are  based  on  the  simplex  method  (Suhl  1989).  It  is 
also  very  important  to  follow  the  methods  and  principles 
of  contemporary  software  engineering.  For  example,  it  is 
desirable  to  attain  a  high  degree  of  portability.  We  also 
emphasize  the  need  for  modularity,  ease  of  use  and  other 
software qualities. Our goal was to obtain these qualities
without abandoning the standard PC user
interface (pop-up and pull-down menus, windows, etc.).

- two levels of use are provided: 1. interactive menu-driven
use; 2. as a library of Fortran subroutines driven by
the user program.

-  LPINT  was  extensively  tested  using  the  so  called  NETLIB 
library  of  LP  test  problems  (Gay  1985). 

At each step of the primal-dual interior point method it is
necessary to solve a presumably sparse linear least squares
problem. It is generally accepted that, if the problem dimension
is not very large, the normal equations approach using sparse
Cholesky factorization (George, Liu 1981) is an adequate
method. The rows and columns of the normal equations matrix
must be preordered in order to exploit sparsity. We have
implemented the following methods for doing this: 1. Minimum
degree algorithm (usually the most efficient method), 2.
Nested dissection method (uses the same data structure as the
minimum degree algorithm, but is less efficient in exploiting
sparsity when LP problems are considered), 3.
Reverse Cuthill-McKee algorithm (a standard profile method
which proved to be more efficient than the minimum degree
algorithm in some rare cases, but otherwise its performance is
poor), 4. Modified Levy algorithm (an alternative profile
method (Billionnet, Breteau 1989)). Among other algorithmic
techniques implemented within LPINT we can mention splitting
dense columns (Gondzio 1991). In general, our goal was not
to invent new algorithms but to enable different
comparisons as a starting point for further investigation of
the algorithms and LP matrices. Perhaps the most distinctive
components of LPINT are its tools for graphical
display of LP constraint matrices and the corresponding
normal equation matrices. A visualization of these matrices
can be particularly helpful when one must decide on an
efficient solution strategy and a possible decomposition of
the LP model. Our ultimate goal, which is not yet fully
achieved, is to create an open software environment for
handling and analysing different sparse matrices (Alvarado
1990), specialized for LP matrices. We believe
that all the features mentioned make LPINT a valuable tool for
postgraduate education and research in the field of LP.
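
Why the preordering of the normal equations matrix matters can be shown with a small symbolic sketch. It is not taken from LPINT, which is described as a Fortran library; the function names and the toy sparsity pattern are assumptions, and the minimum degree rule below is the textbook greedy version without the refinements a production code would use. The sketch builds the adjacency structure of the normal equations matrix from the column patterns of A, counts the fill-in produced by symbolic elimination in a given order, and compares the natural order with a minimum degree order.

```python
def normal_equations_graph(n_rows, columns):
    """Adjacency of the normal equations matrix: rows i and k are linked
    whenever some column of A has nonzeros in both."""
    adj = {i: set() for i in range(n_rows)}
    for rows in columns:
        for i in rows:
            adj[i] |= set(rows) - {i}
    return adj

def fill_in(adj, order):
    """Eliminate vertices in `order`; return the number of fill edges created."""
    adj = {v: set(nb) for v, nb in adj.items()}
    fill = 0
    for v in order:
        nbrs = sorted(adj.pop(v))
        for u in nbrs:
            adj[u].discard(v)
        for i, a in enumerate(nbrs):
            for b in nbrs[i + 1:]:
                if b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill += 1
    return fill

def minimum_degree_order(adj):
    """Greedy ordering: always eliminate a vertex of smallest current degree."""
    adj = {v: set(nb) for v, nb in adj.items()}
    order = []
    while adj:
        v = min(adj, key=lambda u: (len(adj[u]), u))
        nbrs = adj.pop(v)
        for u in nbrs:
            adj[u].discard(v)
        for a in nbrs:
            adj[a] |= nbrs - {a}  # neighbours of v become pairwise adjacent
        order.append(v)
    return order

# Toy pattern: every column of A touches row 0 and one other row.
columns = [{0, 1}, {0, 2}, {0, 3}, {0, 4}]
graph = normal_equations_graph(5, columns)
print(fill_in(graph, range(5)), fill_in(graph, minimum_degree_order(graph)))
```

On this arrow-shaped pattern, eliminating row 0 first makes the remaining four rows pairwise adjacent, whereas the minimum degree order postpones row 0 and creates no fill at all.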


[Figure: LPINT system architecture]

LPINT is currently implemented only on the PC, but the use
of a portable graphical subroutine library (Interacter 1991)
makes it possible to port it to other software and hardware
platforms.
In previous work we designed a general method for the linear diophantine constraint satisfaction
problem, denoted FAST (Fast Algorithm for the constraint Satisfaction Testing), which makes it
possible to prove whether or not a solution exists for a system of constraints over a finite domain.

Namely, the system has the following canonical form:

(S)  $Ax \le b$;  $x \in D$,  $D$ discrete and finite

(all the components of the vector b and all the coefficients of the matrix A are integers).

This problem arises in several applications in computer science, namely in Artificial Intelligence
(for example logical inference and SAT problems, regular problems (pigeonhole, queens, puzzles, ...)
and constraint logic programming), and in the automatic vectorization of programs.

The main characteristic of our method is that it solves a sequence (very short in practice) of
integer programming problems. Each generic problem of this sequence has an appropriate objective
function and a constraint system smaller than the initial one (very much smaller in practice).

The algorithm starts from an initial vector $x^0$ in $D$. Let us denote by

L the subset of the m constraints of (S) already satisfied (i.e. $A_i x^0 \le b_i$, $i \in L$)
and

G the subset of the other constraints (i.e. $A_i x^0 > b_i$, $i \in G$).

From this starting point $x^0$, algorithm FAST generates a finite sequence of k integer vectors
$x^1, x^2, x^3, \ldots, x^k$ in $D$, until either $x^k$ satisfies the system of constraints (S), or the
associated domain $F(S) = \{x \in D \mid Ax \le b\}$ is proved to be empty.

Namely, given an element $x^h$ of the sequence which is not a solution of (S) (with $h \le k-1$), and
denoting again by

L the subset of the m constraints of (S) already satisfied (i.e. $A_i x^h \le b_i$, $i \in L$)
and

G the subset of the other constraints (i.e. $A_i x^h > b_i$, $i \in G$),

the next integer vector $x^*$ (i.e. $x^{h+1}$) is obtained by solving (or partially solving) the
following integer linear programming problem

$$ (P) \qquad \min f(x) \quad \text{s.t.} \quad A_i x \le b_i, \; i \in L, \quad x \in D $$

where f(x) is a positive linear combination of the $(A_i x)_{i \in G}$:

$$ f(x) = \sum_{i \in G} \alpha_i A_i x, \qquad \alpha_i \ge 0 \text{ for all } i \in G. $$

The number of iterations of our method is bounded by the number of constraints of the initial
system, but at each iteration an NP-complete integer programming problem is solved exactly or
approximately.
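
A single iteration of the scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the subproblem is solved by brute-force enumeration of a tiny domain rather than by integer programming, all weights are set to 1, and the function names (`partition`, `fast_step`) and the sample system are hypothetical. How the actual method chooses the weights, solves the subproblems or proves emptiness is not specified in the text and is not modelled here.

```python
from itertools import product

def dot(row, x):
    return sum(a * xi for a, xi in zip(row, x))

def partition(A, b, x):
    """Split constraint indices into L (satisfied at x) and G (violated at x)."""
    L = [i for i, row in enumerate(A) if dot(row, x) <= b[i]]
    G = [i for i in range(len(A)) if i not in L]
    return L, G

def fast_step(A, b, domain, x, alpha=None):
    """One step: minimise f(y) = sum_{i in G} alpha_i * A_i y over y in D
    subject to the constraints in L. Enumeration stands in for the
    integer programming solver (an assumption of this sketch)."""
    L, G = partition(A, b, x)
    if not G:
        return x  # x already satisfies the whole system
    alpha = alpha or {i: 1 for i in G}
    feasible = [y for y in product(*domain)
                if all(dot(A[i], y) <= b[i] for i in L)]
    return min(feasible, key=lambda y: sum(alpha[i] * dot(A[i], y) for i in G))

# Hypothetical system: x1 + x2 <= 2, x1 >= 1, x2 >= 1 on D = {0,1,2}^2.
A = [[1, 1], [-1, 0], [0, -1]]
b = [2, -1, -1]
D = [range(3), range(3)]
x = (0, 0)
while partition(A, b, x)[1]:
    x = fast_step(A, b, D, x)
    print(x)
```

On this instance the sketch needs two steps, within the bound given by the three constraints of the system. The loop is safe here only because the sample system is feasible; a general driver would also need the emptiness test, which is left out.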

We propose to describe a specific version of algorithm FAST, denoted BFAST, devoted to the
exact solution of linear boolean constraint satisfaction problems, i.e. with a system of the type

(BS)  $Ax \le b$;  $x \in \{0,1\}^n$

As a matter of fact, it is well known that a propositional logic clause can be written as a 0-1 linear
inequality in the following way:

the clause

$t_1 \lor \lnot t_2 \lor \lnot t_3 \lor t_4$

is equivalent to the diophantine constraint

$x_1 + (1 - x_2) + (1 - x_3) + x_4 \ge 1$, with $x_i \in \{0,1\}$, $i = 1, \ldots, 4$.

This means that each $x_i$ is a mathematical variable rather than a proposition and is interpreted as
having the numerical value 1 when the proposition $t_i$ is true and 0 when $t_i$ is false. The numerical
inequality asserts that at least one of the four literals is true.
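
The translation of a clause into a 0-1 inequality can be checked mechanically. In the sketch below the literal encoding (pairs of variable index and sign) and the function names are assumptions made for illustration; each variable is assumed to occur at most once per clause. Moving the constants of the negated literals to the right-hand side gives coefficients in {-1, 0, 1} and right-hand side 1 minus the number of negated literals.

```python
from itertools import product

def clause_to_inequality(literals):
    """Return (coeffs, rhs) such that the clause holds iff
    sum(coeffs[i] * x[i]) >= rhs. A literal is (index, is_positive)."""
    coeffs = {}
    negated = 0
    for idx, positive in literals:
        coeffs[idx] = 1 if positive else -1
        if not positive:
            negated += 1
    return coeffs, 1 - negated

def clause_holds(literals, x):
    return any(x[i] == (1 if positive else 0) for i, positive in literals)

def inequality_holds(coeffs, rhs, x):
    return sum(c * x[i] for i, c in coeffs.items()) >= rhs

# The clause t1 v not t2 v not t3 v t4 from the text.
clause = [(1, True), (2, False), (3, False), (4, True)]
coeffs, rhs = clause_to_inequality(clause)
for bits in product([0, 1], repeat=4):
    x = dict(zip([1, 2, 3, 4], bits))
    assert clause_holds(clause, x) == inequality_holds(coeffs, rhs, x)
print(coeffs, rhs)
```

For the clause in the text this yields $x_1 - x_2 - x_3 + x_4 \ge -1$, which is the inequality above with the constants collected on the right.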

Thus a set of clauses can be written as a system (BS) corresponding to a generalized covering
problem: all the components of the right-hand side b of the constraints are integers, and each
coefficient of the left-hand side matrix A, each row of which corresponds to a clause,
belongs to $\{-1, 0, 1\}$.

Important typical applications are the inference problem in propositional logic and deductive
databases.

Although the constraint satisfaction problem associated with (BS) is NP-complete, it is possible
to design efficient exact methods for several classes of instances. Recently Hooker has obtained
good results using 0-1 programming tools: his method consists in adding an objective function to the
constraint system (BS) in order to solve an equivalent 0-1 programming problem by a branch and cut
method.
The associated computational experiments show clearly that his algorithm is broadly as fast as the
earlier classical symbolic methods (set of support resolution, the Davis-Putnam procedure, ...) for
logical inference problems.

Our BFAST algorithm solves a sequence of 0-1 programming problems (obtained by adding an
objective function to subsystems of (S)). The solution of each generic 0-1 problem is obtained by a
branch and bound method including heuristics, relaxations and reduction procedures.

The associated C code has been implemented on a SUN 3/160 computer and tested on a large number
of instances of the generalized covering type

$Ax \ge b$;  $x \in \{0,1\}^n$,
with $A \in \{0, 1, -1\}^{m \times n}$ and $b \in \mathbb{Z}^m$,

randomly generated with the Purdom and Brown model: each clause is generated randomly and
independently, each literal has the same probability, and each clause consists of distinct literals.

The  computational  experiments  show  the  efficiency  of  our  method  BFAST. 
Abstract: The problem of bounding the expected value of the objective
function in a stochastic program can be of interest in its own right (for
example, finding the expected project duration in a stochastic PERT
network) or it can be part of a larger setting, for example a two-stage
stochastic program. We consider a general LP of the form: Find
the expected value of Q, where Q is given by

$$ Q = \sum_i \min\{\, q y \mid W y = \omega_i,\ y \ge 0 \,\}\, p_i, $$

where we view $\omega_i$ as the ith realization of a random variable $\tilde\omega$, with $p_i$
being the probability that $\tilde\omega = \omega_i$. Finding the exact value of Q is hard
except for very small problems. However, for general LPs there exist
different approaches for bounding Q, such as the Jensen lower bound and
the Edmundson-Madansky upper bound. Whichever bound is used, one
will often find that the bounds are not tight enough according to
some chosen rule. A natural possibility is then to partition the support
of $\tilde\omega$ and find conditional bounds on each cell of the partition.
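
The effect of partitioning on the two bounds can be seen in a one-dimensional sketch. Everything concrete here is an assumption made for illustration: a single random variable with five equally likely values, and the convex function phi(w) = |w - 2| standing in for the optimal value of the LP as a function of the right-hand side. On each cell the Jensen bound evaluates phi at the conditional mean, and the Edmundson-Madansky bound interpolates linearly between the cell's end points.

```python
from fractions import Fraction as F

def cell_bounds(cell, phi):
    """cell: list of (value, probability). Returns (probability of the cell,
    Jensen lower bound, Edmundson-Madansky upper bound), both conditional."""
    prob = sum(p for _, p in cell)
    mean = sum(v * p for v, p in cell) / prob
    a = min(v for v, _ in cell)
    b = max(v for v, _ in cell)
    jensen = phi(mean)
    if a == b:
        em = phi(a)
    else:
        em = ((b - mean) * phi(a) + (mean - a) * phi(b)) / (b - a)
    return prob, jensen, em

def total_bounds(cells, phi):
    """Probability-weighted sums of the conditional bounds over all cells."""
    results = [cell_bounds(c, phi) for c in cells]
    lower = sum(p * lo for p, lo, _ in results)
    upper = sum(p * up for p, _, up in results)
    return lower, upper

def split_at_midpoint(cell):
    """Split a cell with at least two distinct values at (min + max) / 2."""
    a = min(v for v, _ in cell)
    b = max(v for v, _ in cell)
    mid = (a + b) / 2
    return ([(v, p) for v, p in cell if v <= mid],
            [(v, p) for v, p in cell if v > mid])

phi = lambda w: abs(w - 2)                      # stand-in for the LP value
support = [(F(v), F(1, 5)) for v in range(5)]   # values 0..4, probability 1/5 each
exact = sum(p * phi(v) for v, p in support)
print(total_bounds([support], phi), exact)
print(total_bounds(list(split_at_midpoint(support)), phi))
```

In this toy case one split at the midpoint closes the gap completely, because phi is linear on each of the two resulting cells; in several dimensions the choice of which variable to split on becomes the central question.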

In this paper we discuss different ways of partitioning the support. It is
fairly obvious that one will always partition the cell(s) with the largest
error (where error is measured as the difference between the upper and
lower bound multiplied by the probability associated with the cell). However,
given a cell, one must decide how to partition it. Due to the difficulty of
finding conditional expectations over anything but rectangles we immediately
decided to consider only partitions that affect one random variable
at a time. Also, after some preliminary testing, we decided to split a
cell in the middle, i.e. as close to the midpoint between the minimal and
maximal value as possible. Of course, one could also have chosen the
mean or median. Our computations indicate that this is less useful, but
that the difference is not substantial.

However, our main issue is to understand better which dimension (random
variable) to partition on. Our results indicate very clearly that
picking the correct dimension is crucial. This is perhaps best understood
if we for a moment assume that we introduce a random variable that
does not show up anywhere in the LP. If we choose to split on this random
variable, the bounds will remain unchanged, but we now have two
cells, each as difficult as the first one. Hence, we must bring both cells
down to an acceptable error and this is basically twice as hard as bringing
the error associated with the original cell down. In other words, picking
an incorrect random variable has doubled our workload. Of course, we
never have such random variables in a problem. But we will often have
random variables that are totally uninteresting (for example the duration
of an activity in a PERT network which is such that, irrespective of the
value taken by this random variable, the activity is never critical). The
problem with an incorrect choice is that we never recover from it. With a
totally useless partition, the remaining workload associated with the cell
in question basically doubles, and that cannot be offset later on.

Given  this  important  observation  we  discuss  a  number  of  basic  approaches 
to  the  question  of  how  to  pick  the  right  random  variable.  These  basic 
approaches  are  combined  with  different  shortcuts.  For  example,  if  we 
foresee  that  a  given  partition  will  cause  one  of  the  resulting  new  cells 
to immediately satisfy the error bounds, we choose this partition without
further  testing. 

The  talk  will  outline  these  approaches  and  present  numerical  results.  We 
demonstrate that bad choices can increase the workload by several orders of
magnitude.

Extended abstract


One of the commonly imposed assumptions in classical scheduling theory
is that any task is processed by one processor at a time [10, 1]. With the
development of technology, parallel systems and parallel algorithms, this
assumption is no longer so obvious. For example, consider a fault-tolerant system in
which several processors test each other [13], or a testing system in which one
processor stimulates the tested object and another processor analyses
its output. Another range of applications appears in the field of new parallel
algorithms and the corresponding task systems.

In recent years several papers have dealt with problems in which a task
requires more than one processor simultaneously. Two groups of models have
been distinguished. In the first group of models it is assumed that any task
can be executed on any set of processors under the condition that a fixed
number of processors is assigned to the task [6, 11, 7, 9, 15]. There are three
models in this group [16]: in the model called "$size_j$" a task requires a fixed
number of processors simultaneously [6, 7]; in the model "$cube_j$" a task
requires a number of processors which is a power of 2 (e.g. either 1 or 2 or 4
etc. processors) [9, 15]; in the model "any" each task can be executed on
any subset of the processors but the execution speed depends on the number
of processors processing the task [11, 17].

In the second group of models it is assumed that it is not the number of processors
that is important, but the set of processors processing a task [14, 3, 5]. This
problem is similar to classical scheduling with additional resources [8] and can
be expressed in terms of weighted graph colouring [14]. There are two models
in this group: the model "$fix_j$", where a task can be executed by
a fixed set of processors [14, 3, 5], and the model "$set_j$", in which each task has a
set of alternative sets of processors by which it can be processed.

In this paper we will concentrate on the model "$set_j$", which is a
generalization of the model "$fix_j$". Before presenting results we will set up the
problem more formally.

We are given a set $\mathcal{T}$ of n tasks and a set $\mathcal{P}$ of m dedicated processors. Each
task $T_j$ requires for its processing a set of processors $D_i$ simultaneously, from
a set $S_j$ of such sets (i.e. $S_j = \bigcup_{i=1}^{|S_j|} \{D_i\}$). We will call these sets $D_i$
processing modes or processing configurations of task $T_j$.

The processing time of a task may depend on the set of processors processing
it. We assume that the processing times of tasks are given in the matrix

$X = \{ t_j^i : t_j^i \in \mathbb{R}^+$ is the processing time of task $T_j$ in processing mode $i$
requiring the set of processors $D_i$; if $T_j$ cannot be scheduled in this mode then
$t_j^i = +\infty \}$.

Tasks are independent. We will analyze the preemptable and nonpreemptable task
cases. In the case of preemptable tasks, any task can be interrupted at no cost and
restarted later, possibly in a different processing mode. In this case we also assume
that the processing percentages of tasks processed in various processing modes are
"additive", in other words they can be accumulated. For example, if some task has been
processed for 1 second in processing mode A while the total processing time for this
task in this mode is 10 seconds, then 10% of the task has been processed. If this task
is next processed for an additional 20% in some other processing mode, then 30% of it
has been processed. After restarting in processing mode A, this task will occupy the
processors required in this mode for 7 seconds. This approach is similar to the case
of scheduling on unrelated machines or scheduling under resource requirements [4].
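
The additivity assumption can be illustrated with a few lines of Python. This is only a sketch of the arithmetic in the example above; the function name `remaining_time`, the mode name "B" and its total time of 5 seconds are assumptions introduced for illustration, and only finite processing times are handled.

```python
from fractions import Fraction

def remaining_time(times, history, target_mode):
    """Seconds still needed in `target_mode` after the processing in `history`.

    times:   dict mode -> full processing time of the task in that mode
    history: list of (mode, seconds) pairs already executed
    """
    # Fractions of the task completed in different modes simply add up.
    done = sum(Fraction(seconds) / Fraction(times[mode]) for mode, seconds in history)
    if done > 1:
        raise ValueError("task has been processed for more than 100%")
    return (1 - done) * Fraction(times[target_mode])

# Mode A needs 10 s in total; mode B (hypothetical) needs 5 s in total.
times = {"A": 10, "B": 5}
history = [("A", 1), ("B", 1)]               # 10% in mode A, then 20% in mode B
print(remaining_time(times, history, "A"))   # 7
```

Exact fractions are used so that the 30% already completed leaves exactly 7 seconds in mode A, with no floating-point rounding.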

The optimality criterion is schedule length ($C_{max}$).

To denote the analyzed problems we will use an extended version of the scheme
proposed by Graham, Lawler, Lenstra and Rinnooy Kan [12] with later extensions
[8, 16]. In this notation a scheduling problem is described by three fields. The first
field describes the processor system. In this work it will be the letter P, optionally
followed by a positive integer which denotes the number of processors. If there is no
constant after P, then the number of processors is not fixed and is given in the
current instance of the problem. The second field describes the task system. The word
"pmtn" is used to denote that tasks are preemptable; if this word is absent, tasks are
nonpreemptable. The word "$set_j$" denotes the simultaneous requirement of multiple
processors by tasks. Moreover, in general any task can be processed by more than one
such set of processors. The last field denotes the optimality criterion; here it is
$C_{max}$.

In  the  paper  we  will  present  a  dynamic  programming  based  procedure  to 
solve  optimally  simple  cases  of  the  nonpreemptive  version  of  the  problem. 
This  will  result  in  pseudopolynomial  algorithms.  For  a  general  case  of  the 
nonpreemptive  scheduling  a  heuristic  will  be  proposed  and  its  worst  case 
behavior  will  be  analyzed.  The  preemptive  version  of  the  problem  will  be 
solved  via  linear  programming.  The  organization  of  the  paper  is  as  follows. 
In  section  2  the  case  of  nonpreemptive  scheduling  is  considered.  In  section  3 
the  preemptive  version  of  the  problem  is  solved. 


Nonpreemptive  Scheduling 

In general the problem $P \mid set_j \mid C_{max}$ is NP-hard. This can be easily shown
by a reduction from the set partition problem to problem $P2 \mid set_j \mid C_{max}$.
For three processors and tasks requiring processors from only one set the problem is
NP-hard in the strong sense [5]. Thus, it is unlikely that an algorithm solving these
problems in polynomial time exists. Moreover, for more than two processors it is hard
to expect a pseudo-polynomial time algorithm.

In this section we will present pseudo-polynomial time algorithms for problem
$P2 \mid set_j \mid C_{max}$ and for a restricted version of problem
$P3 \mid set_j \mid C_{max}$. Then a simple heuristic for problem
$P \mid set_j \mid C_{max}$, with a worst-case behavior bound equal to $m$, will be
presented.
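
To give a feeling for why the two-processor case admits a pseudo-polynomial algorithm, the following Python sketch shows one possible dynamic program. It is not necessarily the procedure of the paper. It assumes integer processing times and three possible modes per task (processor P1 alone, P2 alone, or both together), with `math.inf` marking a mode that is not available. The function name `p2_set_cmax` and the tuple layout are assumptions made for illustration. The sketch relies on the observation that the schedule length equals the larger of the two processors' total busy times: two-processor tasks can be run first, back to back, followed by the single-processor tasks without idle time.

```python
import math

def p2_set_cmax(tasks):
    """Minimum schedule length for independent nonpreemptable tasks on two processors.

    tasks: list of (t1, t2, t12) processing times for the modes {P1}, {P2}, {P1, P2};
           math.inf marks a mode in which the task cannot be scheduled.
    """
    # best[a] = smallest total busy time of P2 among assignments keeping P1 busy for a units
    best = {0: 0}
    for t1, t2, t12 in tasks:
        nxt = {}
        for a, b in best.items():
            # (increase of P1 busy time, increase of P2 busy time) for each mode
            for da, db in ((t1, 0), (0, t2), (t12, t12)):
                if math.isinf(da) or math.isinf(db):
                    continue                      # mode not available
                na, nb = a + da, b + db
                if nb < nxt.get(na, math.inf):
                    nxt[na] = nb
        best = nxt
    if not best:
        raise ValueError("some task has no feasible processing mode")
    return min(max(a, b) for a, b in best.items())

inf = math.inf
print(p2_set_cmax([(3, 3, 2), (4, inf, inf), (inf, 2, inf)]))   # 5
```

The table never holds more entries than the sum of all processing times plus one, so the running time is polynomial in the number of tasks and in the magnitude of the processing times, which is what "pseudo-polynomial" means here.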

Preemptive  Scheduling 

In this section we will analyze the problem $P \mid set_j, pmtn \mid C_{max}$. In
general (when the number of processors is unbounded), the problem in question is
NP-hard in the strong sense [2]. For a limited number of processors, however, this
problem can be solved in polynomial time using a linear programming procedure.
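
One common way to set up such a linear program is sketched below; it is not necessarily the formulation used in the paper. Let $t_j^i$ be the processing time of task $T_j$ in mode $i$, as above. The sketch enumerates the feasible configurations $M_1, \dots, M_K$, where each $M_k$ is a set of (task, mode) pairs $(j, i)$ in which no task appears twice and the required processor sets are pairwise disjoint. The symbols $M_k$, $K$ and $x_k$ are assumptions introduced for illustration; $x_k$ is the total time for which configuration $M_k$ is run.

```latex
\begin{align*}
\text{minimize}\quad   & C_{max} = \sum_{k=1}^{K} x_k \\
\text{subject to}\quad & \sum_{k=1}^{K} \; \sum_{i \,:\, (j,i) \in M_k} \frac{x_k}{t_j^i} = 1,
                         \qquad j = 1, \dots, n, \\
                       & x_k \ge 0, \qquad k = 1, \dots, K.
\end{align*}
```

Each equality uses the additivity assumption: the fractions of task $T_j$ completed in its various modes must sum to one. For a fixed number of processors $m$, a configuration contains at most $m$ pairs, so $K$ grows only polynomially with $n$.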
Abstract

The population of parallel genetic algorithms (PGAs) can easily be split up to match
the needs of coarse-grained parallelism. A cluster of interconnected workstations,
seen as an MIMD architecture, is the hardware chosen to express this kind of
parallelism. A PGA implementation, like any other parallel algorithm, is bound to an
execution environment that gives sustained support for its realization. Our execution
environment, called PARNET, is constructed as a distributed server referred to as the
base layer. The next higher level of abstraction is provided by the object-oriented
interface layer, which allows the PGA layer to be constructed on top of both. We
introduce the PARNET concept, leading from the single-machine operating environment
to the distributed realization of a PGA.


1  Introduction 

Genetic algorithms belong to a class of optimization strategies which can be
implemented most efficiently on MIMD architectures. A PGA population matches the needs
of coarse-grained parallelism for efficient, almost asynchronous processing [1].

The realization in a cluster of interconnected workstations must be supported by an
execution environment that provides the necessary abstraction of the interconnecting
network. While distributed operating systems like Helios provide and prescribe
specialized support for distributed computing in general, and therefore allow a kind
of abstraction from single processing nodes, this support is missing on a network of
general-purpose workstations. In this context, facilities like remote procedure calls
and remote execution can only be seen as basic access to the potentially combinable
computational power of a workstation cluster.

The PARNET computation environment is an approach to provide better access to the
overall computation power in distributed environments. The administration needed for
distributed objects can be done on a per-application basis instead of at the operating
system level. The PARNET functionality is thus a subset of what could be expected from
a distributed operating system. Typical problems of these systems can be avoided or at
least alleviated, and the functionality can be implemented more efficiently.


2  PARNET 

2.1  Overall  Structure 

The overall structure of PARNET is a per-application distributed server. The parts of
this server are constructed following the layer model shown in figure 1. On each
processing node, at least one part of the server is running. The base layer handles
the main aspects of abstraction from the network environment and provides low-level
message passing and semaphore-based synchronization across machine boundaries. The
application programmer can use the base layer interface directly or rely on the next
abstraction level, provided by the interface layer. This layer provides more complex
synchronization features in an object-oriented fashion. Built on top of this layer, a
framework for implementing parallel algorithms is provided. Here we show a PGA layer
matching the needs of coarse-grained parallel processing of genetic algorithms. It
may, however, be exchanged for another layer, e.g. one supporting a general framework
for event-driven simulation.


PGA  Layer 
Interface  Layer 
Base  Layer 


Figure  1:  PARNET  layer  model 

The implementation is built on the widely accepted programming model of multithreaded
tasks, available on state-of-the-art operating systems (e.g. Mach, Solaris 2.x,
Helios), together with a highly reliable communication facility (TCP/IP, Helios
message passing) [5]. The thread programming model must at least provide a fork() call
to start threads and binary semaphore operations (P(), V() or Signal(), Wait()).

2.2  Base-Layer 

Providing  a  flexible  and  extensible  base  environment  leads  to  the  main  functionality  of 
the  base  layer: 

•  initial  booting  from  the  local  workstation, 

•  crash  detection  for  remote  computing  nodes, 

•  hierarchically organized naming scheme for several kinds of objects,

•  efficient message transfers by building optimally sized data packets,

•  dynamic creation of name-addressable communication ports.

To operate on a per-application basis, the PARNET environment is bound to the
application itself and spreads out starting from the local machine. Because of this
mechanism an erroneous application can only crash itself, whereas an implementation in
the style of a system service may affect further applications directly or indirectly.

A minimal necessary support for distributed applications is a crash detection
facility. Any desirable crash recovery is either coupled in some way with the
application or leads to a specialized parallelization paradigm for which the problem
of recalling a past state can be solved efficiently. Because we do not concentrate on
a specialized parallelization method in the base layer, we support crash detection
followed by shutdown of the application.

For naming and addressing of distributed objects in PARNET, we provide a hierarchical
name space. Again we benefit from the per-application approach in PARNET: an
application can trust itself, and therefore no access control or authorization is
needed. The object name tree is distributed between the server parts, allowing local
interpretation per context. A caching strategy is used to avoid unnecessary
communication requests when searching for objects (locate()). Figure 2 shows a
snapshot of a typical name space. Terms in quotation marks denote user-created
objects, which may be functions, classes and their methods, mail-ports and further
object names.

[Name tree: server part "00" with the branches classes, functions and mail-ports;
user-created entries "individual", "accessible functions", "mail-port names" and
"genetic methods"]

Figure 2: PARNET name space

The addressing of any object in PARNET is based on this naming scheme. As an example,
if we address data to "/scylla/01/functions/beep", then on host "scylla", server part
"01", a function beep is called.

Addressing data to "mail-port/incoming_data" will search for a mail-port with the
name "incoming_data" first in the local context and second in a remote context. The
first occurrence of the name is the address the data will be delivered to.
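
The lookup order described above (local context first, then remote contexts, with caching of remote hits) can be sketched in Python as follows. This is an illustration only and not the PARNET API: the class `NameSpace`, its attributes and the second host name "charybdis" are hypothetical.

```python
class NameSpace:
    """Toy name table for one server part (hypothetical helper, not PARNET's API)."""

    def __init__(self, host, part):
        self.prefix = f"/{host}/{part}"
        self.objects = {}   # relative name -> object registered in this context
        self.remotes = []   # other NameSpace instances, searched in order
        self.cache = {}     # relative name -> absolute name found remotely earlier

    def register(self, name, obj):
        self.objects[name] = obj

    def locate(self, name):
        """Return the absolute name of the first match: local context, then remote."""
        if name.startswith("/"):
            return name                          # already fully qualified
        if name in self.objects:
            return f"{self.prefix}/{name}"       # local context wins
        if name in self.cache:
            return self.cache[name]              # avoids asking the remotes again
        for remote in self.remotes:
            if name in remote.objects:
                absolute = f"{remote.prefix}/{name}"
                self.cache[name] = absolute
                return absolute
        raise KeyError(name)


local = NameSpace("scylla", "01")
remote = NameSpace("charybdis", "01")            # hypothetical second host
local.remotes.append(remote)
remote.register("mail-port/incoming_data", [])
print(local.locate("mail-port/incoming_data"))   # /charybdis/01/mail-port/incoming_data
```

In a real distributed setting the loop over `remotes` would be a network request, which is exactly the cost the cache is meant to avoid.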

Although communication still proves to be the bottleneck in most distributed
applications, some effort can be made to alleviate this [8]. In our multithreaded
implementation there is a sending thread for data communication to each remote server
part, so data is sent only while this thread is actually executing. It is a quite
natural approach to pack together as many messages as are currently available in
order to have a good chance of achieving efficient transfer rates.
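
The packing strategy can be sketched with Python's standard `queue` and `threading` modules. The sketch assumes one outgoing queue per remote server part; the names `sender_loop`, `transmit` and the `STOP` sentinel are hypothetical and only serve the illustration.

```python
import queue
import threading

STOP = object()   # sentinel telling the sender thread to finish

def sender_loop(outbox, transmit):
    """Drain `outbox` and hand every batch of pending messages to `transmit`."""
    while True:
        batch = [outbox.get()]               # block until at least one message exists
        while True:                          # then take everything else that is waiting
            try:
                batch.append(outbox.get_nowait())
            except queue.Empty:
                break
        stop = any(m is STOP for m in batch)
        payload = [m for m in batch if m is not STOP]
        if payload:
            transmit(payload)                # one transfer for the whole batch
        if stop:
            return

outbox = queue.Queue()
for message in ("a", "b", "c"):
    outbox.put(message)
outbox.put(STOP)

sent = []
worker = threading.Thread(target=sender_loop, args=(outbox, sent.append))
worker.start()
worker.join()
print(sent)   # [['a', 'b', 'c']]
```

Because the three messages were queued before the sending thread ran, they leave in a single packet; messages arriving while a transfer is in progress would simply form the next batch.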

Traditionally only static structures for communication are realized in distributed
applications. This is normally imposed by the underlying support tool (Interface
Description Language for RPC, or Component Distribution Language for Helios). Hence
dynamic creation of communication ports is an unusual feature: it allows data to be
sent to mail-ports which have not been created so far, on the assumption that the
addressed mail-port will be created in the near future. There are several situations
where this may simplify the application code and avoid explicit synchronization
before starting to communicate.
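
A minimal sketch of this behavior, assuming that messages for a port that does not exist yet are simply parked until the port is created: the class `MailPorts` and its methods are hypothetical and are not taken from PARNET.

```python
import threading
from collections import defaultdict, deque

class MailPorts:
    """Registry that accepts messages for ports which do not exist yet (illustrative only)."""

    def __init__(self):
        self._lock = threading.RLock()        # re-entrant, so a handler may send again
        self._pending = defaultdict(deque)    # port name -> messages awaiting the port
        self._handlers = {}                   # port name -> callable receiving messages

    def send(self, name, message):
        with self._lock:
            handler = self._handlers.get(name)
            if handler is None:
                self._pending[name].append(message)   # port not created so far: park it
            else:
                handler(message)

    def create(self, name, handler):
        with self._lock:
            self._handlers[name] = handler
            for message in self._pending.pop(name, deque()):
                handler(message)                      # deliver the backlog in order

ports = MailPorts()
received = []
ports.send("incoming_data", 1)            # sent before the port exists
ports.send("incoming_data", 2)
ports.create("incoming_data", received.append)
ports.send("incoming_data", 3)
print(received)   # [1, 2, 3]
```

The sender needs no handshake with the receiver, which is the explicit synchronization the text says can be avoided.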


2.3  Interface Layer

Object-oriented programming has proved to be a valuable software engineering
approach. An object is described by a class defining a data type for which access to
data is restricted to a specific set of access functions. The PARNET base layer is
designed to handle a wide range of parallel distributed applications. Hence access at
the abstraction level of the base layer is indispensable. Interface classes are used
to provide access to the PARNET base layer at the higher level of well-known concepts
of parallel programming.

The PARNET interface layer uses the model of invoking objects remotely to provide
access to remote data and control of remote threads. Application data can be declared
as a class and can be handled as an object thereafter. Threads of coarse-grained
parallelism are defined as a set of member functions of a class. Access to remote
objects is provided via numeric object identifiers.

The idea of using classes leads to the object-oriented model of inheritance, which is
used extensively in the PARNET interface. A subclass can be derived from its base
class, and the properties of the base class are inherited by the subclass. Inheritance
is used to gain access to the base layer at different levels of well-known
programming concepts. The level of abstraction grows along the class inheritance
tree. Access is permitted at any level of abstraction. Calling a meaningful operation
at application level forces the processing of a unique chain of arbitrary operations
along the directed graph of the interface class hierarchy.

Let us consider a simple example. At a low level of abstraction a locking primitive is
provided by the PARNET base layer. The class Monitor is derived from Locking and
provides a Hoare monitor using the properties inherited from Locking. In a next step
Semaphore is derived from Monitor and builds a counting semaphore using Monitor
methods. In a last step Fifo Semaphore is derived from Semaphore, using a couple of
semaphores to construct the FIFO behavior.
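
The inheritance chain just described can be mirrored in Python. The following is a sketch under stated assumptions and not the PARNET classes: it uses `threading.Lock` as the locking primitive, and the monitor is built on `threading.Condition`, which has Mesa-style rather than Hoare-style signalling (a woken thread rechecks its condition in a loop). Method names such as `P`, `V`, `wait` and `notify` are chosen for illustration.

```python
import threading

class Locking:
    """Lowest level: a plain mutual-exclusion lock."""
    def __init__(self):
        self._lock = threading.Lock()
    def acquire(self):
        self._lock.acquire()
    def release(self):
        self._lock.release()

class Monitor(Locking):
    """Adds a condition variable on top of the inherited lock (Mesa-style signalling)."""
    def __init__(self):
        super().__init__()
        self._cond = threading.Condition(self._lock)
    def wait(self):
        self._cond.wait()          # caller must hold the lock
    def notify(self):
        self._cond.notify()        # caller must hold the lock

class Semaphore(Monitor):
    """Counting semaphore built only from Monitor operations."""
    def __init__(self, value=0):
        super().__init__()
        self._value = value
    def P(self):
        self.acquire()
        try:
            while self._value == 0:
                self.wait()
            self._value -= 1
        finally:
            self.release()
    def V(self):
        self.acquire()
        try:
            self._value += 1
            self.notify()
        finally:
            self.release()

class FifoSemaphore(Semaphore):
    """Serves waiters in arrival order, using one private Semaphore per waiter."""
    def __init__(self, value=0):
        super().__init__(value)
        self._waiters = []
    def P(self):
        self.acquire()
        if self._value > 0 and not self._waiters:
            self._value -= 1
            self.release()
            return
        ticket = Semaphore(0)
        self._waiters.append(ticket)
        self.release()
        ticket.P()                 # block until a V() hands over a permit
    def V(self):
        self.acquire()
        try:
            if self._waiters:
                self._waiters.pop(0).V()   # pass the permit to the oldest waiter
            else:
                self._value += 1
        finally:
            self.release()
```

Each level adds one concept and reuses everything below it, so a call such as `FifoSemaphore.P()` runs through a chain of operations defined at several levels of the hierarchy.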

Figure 3 introduces the simplified PARNET interface hierarchy of derived classes. Base
classes (shown in roman characters) are wrappers around the PARNET base layer. These
abstract classes cannot be used directly by the application programmer. They are base
classes from which user classes at a higher level of abstraction (shown in emphasized
characters) are derived.

Figure 3: The PARNET interface hierarchy

Four base classes represent the functionality of the PARNET base layer. Two more base
classes are derived from the Message class; they provide different communication
interfaces to the programmer. Different kinds of Queues are provided as well as a
distributed virtual Shared Memory. On top of the latter class, well-known
communication patterns like single-writer-multiple-reader can be built using the
Semaphore class. Threads may be spawned in a Synchronous or in an Asynchronous way; a
kind of processor group facility is provided by the Topology class. Barrier,
Condition and Monitor are derived from the Locking base class. These classes are used
to synchronize threads. To enable threads to react to the network computing
environment, an Info class is provided.

The development of parallel applications is simpler with shared memory than with
message passing. A drawback of using shared memory in a distributed environment is
the relatively slow communication medium (e.g. ethernet). The PARNET interface
provides features to control the distribution, protection and consistency of shared
data in order to minimize the base layer network traffic.

Using this interface, non-trivial distributed parallel applications can be built
whose complexity regarding control and communication patterns surpasses that of usual
remote procedure calls. Once invoked, threads may have "a life of their own",
communicating with other threads without the need for a central instance. This is an
important requirement for using the interface for Parallel Genetic Algorithms.


2.4  A  PGA  Layer 

A PGA application can be characterized by the following properties: all threads
involved perform the same non-trivial iterative procedure. No systemwide information
has to be updated frequently. Instead, global information is either read-only or
rarely updated and subject only to a weak consistency condition. Most steps of
execution can be done using local data, which is private to each thread.


class individual derived from Parnet          // Class uses PARNET interface
    shared data my_solution                   // My own solution
    shared data all_neighbors                 // Other solutions I know

    virtual function terminate                // Members to be declared in
    virtual function accept                   // derived classes

    function main                             // The predefined GA interface
        while not terminate()
            selected_neighbor = select(all_neighbors)
            temp_solution = crossover(my_solution, selected_neighbor)
            temp_solution = mutation(temp_solution)
            temp_solution = local_opt(temp_solution)
            if accept(temp_solution, all_neighbors)
                my_solution = temp_solution
        end while
    end function
end class


Figure  4:  An  abstract  PGA  interface  class  using  virtual  functions 

In the following, the pseudo code of a GA thread class is presented as an example of
the implementation of user interface classes. Figure 4 shows an abstract class
individual, which is derived from several PARNET interface classes to provide remote
facilities. The shared data represents the current solution in the process of
iteration and the access to the solutions of other executing individuals. Write
access is permitted to my_solution, which may be updated by the individual. The
access to all_neighbors is protected by a read-only constraint.

Apart from the shared data, only one member function, main, is defined. This is the
loop called at invocation time by the thread class. It contains the skeleton of a
generic PGA [6]. For a number of iterations defined by terminate, the function select
chooses a suitable partner for recombination. The crossover operator recombines the
genetic code of the current solution with that of the selected neighbor into a new
(temporary) solution.

class my_individual derived from individual
    virtual function terminate
        body of function ...                  // Define the implementation to be
    end function                              // executed in the interface method main

    virtual function accept
        body of function ...
    end function
end class


Figure  5:  A  user  defined  PGA  class 

Mutation and local_opt are optional operations on the new solution. The accept
operation compares the recombined new data with the solutions of the neighborhood and
indicates whether the old solution is to be overwritten by the new one. Otherwise the
new solution could not dominate the older one and is discarded immediately. In the
case of overwriting, the new solution is forwarded to the individuals that are
neighbors of the one under consideration.

No object of class individual can be instantiated. Instead, the abstract class passes
its interface on to a user-defined class. This user-defined class is responsible for
the implementation of the functions declared virtual in the base class. Figure 5
shows an implementation class derived from individual, defining the details of the
PGA operations used by the interface class.

program
    topology<my_individual> population        // Declare a topology of threads
    population.number(32)                     // Define the number of individuals
    population.exec()                         // Asynchronous start of remote threads
    population.wait()                         // Wait for termination of all threads
end program


Figure 6: A program example using the PGA interface

Figure  6  gives  a  program  example  of  the  above  described  interface.  A  topology  of 
individuals  is  defined  within  the  population  object.  The  member  function  number  is 
called  to  determine  the  size  of  the  population.  The  exec  member  invokes  32  remote 
individuals  asynchronously.  The  wait  member  blocks  until  all  individuals  terminate. 
The  called  member  functions  are  defined  in  base  classes  at  different  levels  of  abstraction. 

The methods used in figure 6 are inherited from PARNET classes. A look back at
figure 3 may help to make the class interdependencies more transparent. The method
exec lives in class Thread, while wait is defined in class Asynchronous and number is
a member of class Topology. The resulting source code represents the structure of the
program in an obvious way, hiding the details of networking.
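
The pattern of figures 4 to 6 (a fixed main loop in an abstract class, with terminate and accept supplied by a derived class) can be tried out locally with the Python sketch below. It is not the PARNET implementation: local threads stand in for remote threads, all_neighbors is a fixed pool of initial solutions, accepted solutions are not forwarded to neighbors, and the bit-counting fitness, the operator defaults and the helper `run_demo` are assumptions made for illustration.

```python
import random
import threading
from abc import ABC, abstractmethod

class Individual(ABC):
    """Abstract GA thread: main() is fixed, terminate() and accept() come from subclasses."""

    def __init__(self, solution, neighbors, seed=0):
        self.my_solution = solution
        self.all_neighbors = neighbors          # treated as read-only
        self.rng = random.Random(seed)

    @abstractmethod
    def terminate(self):
        """Return True when the iteration should stop."""

    @abstractmethod
    def accept(self, candidate, neighbors):
        """Return True if candidate should replace my_solution."""

    def select(self, neighbors):
        return self.rng.choice(neighbors)

    def crossover(self, a, b):
        cut = self.rng.randrange(1, len(a))     # one-point crossover
        return a[:cut] + b[cut:]

    def mutation(self, s):
        i = self.rng.randrange(len(s))          # flip one random bit
        return s[:i] + (1 - s[i],) + s[i + 1:]

    def local_opt(self, s):
        return s                                # optional step, a no-op here

    def main(self):
        while not self.terminate():
            partner = self.select(self.all_neighbors)
            temp = self.crossover(self.my_solution, partner)
            temp = self.mutation(temp)
            temp = self.local_opt(temp)
            if self.accept(temp, self.all_neighbors):
                self.my_solution = temp


class MyIndividual(Individual):
    """User class: maximise the number of ones in a bit tuple for a fixed number of steps."""

    steps_left = 200

    def terminate(self):
        self.steps_left -= 1
        return self.steps_left < 0

    def accept(self, candidate, neighbors):
        return sum(candidate) >= sum(self.my_solution)


class Topology:
    """Runs a population of individuals as local threads (stand-in for remote threads)."""

    def __init__(self, factory):
        self.factory, self.members, self.threads = factory, [], []

    def number(self, n):
        self.members = [self.factory(i) for i in range(n)]

    def exec(self):
        self.threads = [threading.Thread(target=m.main) for m in self.members]
        for t in self.threads:
            t.start()

    def wait(self):
        for t in self.threads:
            t.join()


def run_demo(n=4, bits=16):
    """Return (initial, final) solution pairs for a small population."""
    rng = random.Random(42)
    pool = tuple(tuple(rng.randint(0, 1) for _ in range(bits)) for _ in range(n))
    population = Topology(lambda i: MyIndividual(pool[i], pool, seed=i))
    population.number(n)
    population.exec()
    population.wait()
    return [(pool[i], m.my_solution) for i, m in enumerate(population.members)]
```

Calling `run_demo()` mirrors the program of figure 6: declare the population, set its size, start all individuals asynchronously and wait for them. Because accept never takes a worse candidate, each final solution has at least as many ones as its starting point.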
3  Conclusion

We presented the PARNET approach, leading from single-machine computing environments
to a realization of a distributed application environment in a workstation cluster.
The object-oriented approach best fits the coarse-grained parallelism we wish to
realize. Following these approaches, a framework for parallel genetic algorithms has
been introduced.

Well-known network facilities such as socket-based communication, remote execution
and remote procedure calls do not provide the desired programming support. A
specialized distributed operating system lacks the robustness desired for carrying
out everyday work [2]. The PARNET approach, situated above the base operating system,
pursues a per-application environment to avoid or at least alleviate typical problems
of distributed computing.

The PARNET idea is not the only approach so far. A programming tool called PVM, with
slightly different goals and a rather different concept, has recently become
available [3]. Some other related work concentrates on object-oriented interfacing
[2], [7], [4]. Some efforts were made to hide nearly all distribution. Our work
instead aims at hiding hard-to-use details of networks while providing the ability to
express distribution visibly.


The purpose of this paper is the description of a system that, while fully exploiting
and integrating the advanced features of the Microsoft Excel commercial spreadsheet
software and Hewlett-Packard's NewWave office environment, extends them with new
optimisation and multiple criteria group decision model building capabilities.

The  new  features  include  a  meta-model  building  language  which  allows  the  automatic 
generation  of  a  class  of  mathematical  programming  spreadsheet  models  dynamically 
linked  to  public  and  private  databases.  These  models  can  immediately  be  used  for  both 
intuitive experimentation and optimisation. Figure 1 shows a sample model.

An archiving tool allows the user to save and later retrieve any given state of the
model  together  with  a  freely  selectable  set  of  characteristic  indicators.  The  indicators 
belonging  to  different  saved  states  of  the  model  can  be  easily  compared  using  graphical 
charts. Figure 2 shows the chart used to compare the saved states.

The system works in both the Apple Macintosh and the Microsoft Windows environments.
Its functionality is significantly enhanced with inter-application communication and
dynamic data exchange tools that we extended beyond the built-in capabilities of the
commercial environments to computers connected by a network.

The optimisation and multiple criteria decision making can in this way be performed
in an environment where dynamically changing data originating from shared databases
or other members of the decision making group are permanently taken into account.

The decision makers' communication needs are supported at two different levels. The
first level is the already mentioned dynamic data exchange. A hot link to data can be
established without requiring the intervention of the user who is linked to the data.
Figure 3 shows the corresponding interface under Microsoft Excel.


Figure  3 


The second level is implemented under Hewlett-Packard NewWave. It lets the partners
view all objects on each other's office desk. They can even copy or move objects to or
from their partners' desks. Of course, permissions for viewing, copying, moving etc.
can be selectively assigned to all types of objects. This feature, which complements
the functionality of NewWave, is only implemented on a PC under Microsoft Windows.
Figure 4 shows the view of the office desk of another partner.


Abstract - The objectives of this paper are to provide the context for private sector
tollways in Australia, to explain the traffic forecasting methods employed, and to
identify the evaluation criteria. Attention is focused on the technical aspects of the
traffic estimation and evaluation process, which takes into account traffic
forecasting techniques, traffic diversion and assignment, the temporal distribution of
peak-hour traffic demands, vehicle operating costs and fuel consumption, monetary
values of travel time and discount rates.


1.  INTRODUCTION 

Traditionally, the main roads in Australia have been constructed by state governments
through their Department of Main Roads. However, in the state of New South Wales in
recent years, the policy has shifted to accommodate, and then encourage, private
sector participation in the construction and operation of road facilities. The $400
million Sydney Harbour Tunnel, which has been under construction since early 1989, is
an example where the state government has allowed the construction of a road facility
funded by a private consortium. Furthermore, the Department of Main Roads (now Roads
and Traffic Authority) of the state of New South Wales has called for expressions of
interest for three privately funded toll roads, namely: the Buladelah Tollway, on the
Pacific Highway, about 250 km north of Sydney; the F4 toll road in the north western
sector of the Sydney metropolitan area; and the F2 toll road in the western fringe of
Sydney. Attracting private funds will allow the Roads and Traffic Authority to
accelerate the construction of major road projects in accordance with the Roads 2000
Plan. The total funds available from both Commonwealth Government sources and State
sources to the Department of Main Roads in 1986/87, for

ECONOMIC ANALYSIS OF CROSS HARBOUR TRANSPORT

Accuracy of the economic evaluation of toll roads depends on successful forecasting
of the level of traffic using the transport facility. Figure 1 shows the methodology
adopted by the senior author in the economic evaluation of the Sydney Harbour Tunnel
project and other alternative cross harbour transport proposals. The four-lane, 2.4 km
long tunnel, now under construction, is, conceptually, a parallel facility to the
existing bridge, which has 8 road lanes and two train tracks connecting the north and
south banks of Port Jackson and the Parramatta River (the Sydney harbour). The
investment and operating costs will be recovered by the toll levied on traffic
crossing the harbour using either the (existing) bridge or the tunnel (when opened in
late 1992), over a 35-year period commencing in May 1987.

The topmost cell in Figure 1 refers to traffic projections related to Average Annual
Daily Traffic (AADT). In 1985, AADT on the existing Harbour Bridge was 178,180
vehicles (DMR, 1986a, b). DMR (1986a) provides the following traffic projections for
the Sydney Harbour Bridge, based on past traffic trends.

Max Growth: $Y_{max} = 178372 + 3358x$

Min Growth: $Y_{min} = \dfrac{200000}{1 + 0.140\,[\ldots]}$

where,

$Y_{max}$ = estimate of AADT at maximum growth rate;

$Y_{min}$ = estimate of AADT at minimum growth rate; and
Fig. 1: Flow Diagram of Traffic Forecasting Methodology and Economic Evaluation Model
for Sydney Cross-Harbour Transport

Gutteridge, Haskins and Davey (1986), the traffic consultants to the Sydney Harbour
Tunnel Transfield-Kumagai Joint Venture, approached traffic projections from a growth
in southbound traffic on the Sydney Harbour Bridge for Average Annual Weekday Traffic
(AAWT). The shape of the mathematical function is a logistic curve based on "a
strongly linear historic trend and a long term growth constraint, based on a maximum
Service Volume, creating a mathematical asymptote for the growth curve" (Gutteridge,
Haskins and Davey, 1986). Traffic projections for Average Annual Weekday Traffic
(AAWT) up to 2021 are illustrated by Cameron McNamara (1986a), the company responsible
for preparing the tunnel environmental impact statement (EIS). These include "high",
"most likely" and "low" projections. For example, the high projections are calculated
from:

$Y_{wt} = \dfrac{135000}{[\ldots]^{\,(0.64[\ldots]5 \,+\, 0.028[\ldots]\,x)}}$

where

$Y_{wt}$ = estimate of AAWT southbound on bridge and tunnel; and

$x$ = number of years from the tunnel opening year 1992.

The "most likely" traffic projection assumed great importance in the final appraisal
of the project because it formed the basis of the guaranteed revenue stream for the
developers that was "underwritten" by government. An independent, expert review of
the traffic forecasts and economic evaluations was sought from Unisearch Ltd (1987a).

It is at this point that our approach differs from that of the consultants to the
Joint Venturers, and an original traffic model was developed. The preferred
approach is to make projections of AADT and then partition them into Average
Annual Weekday Traffic (AAWT) and Average Annual Weekend Traffic (AAWE).
Based on time series data of traffic volumes on the Sydney Harbour Bridge from
1968 to 1985, regression analysis leads to the following relationship:

$$Y_{we} = 4650 + 0.766\,Y \qquad (r^2 = 0.99)$$

where,

$Y_{we}$ = estimate of average annual weekend traffic (AAWE); and

$Y$ = average annual daily traffic (AADT).

Average Annual Daily Traffic (AADT) equals five-sevenths of Average Annual
Weekday Traffic (AAWT) plus two-sevenths of Average Annual Weekend Traffic
(AAWE). By rearranging, and ignoring public holidays:


AAWT = 7/5 (AADT - 2/7 AAWE).
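
The rearrangement can be checked numerically. The following Python sketch is illustrative only: the function names and the sample volumes are assumptions, not values from the study.

```python
def aadt_from_parts(aawt: float, aawe: float) -> float:
    """AADT as five-sevenths weekday plus two-sevenths weekend traffic."""
    return (5.0 * aawt + 2.0 * aawe) / 7.0


def aawt_from_aadt(aadt: float, aawe: float) -> float:
    """Rearranged identity: AAWT = 7/5 * (AADT - 2/7 * AAWE)."""
    return 7.0 / 5.0 * (aadt - 2.0 / 7.0 * aawe)


# Hypothetical volumes (vehicles per day), chosen only to exercise the identity.
example_aawt = 190_000.0
example_aawe = 140_000.0
example_aadt = aadt_from_parts(example_aawt, example_aawe)
recovered_aawt = aawt_from_aadt(example_aadt, example_aawe)
print(round(example_aadt, 1), round(recovered_aawt, 1))
```

Feeding the computed AADT back through the rearranged formula should return the weekday volume that was started with, which is a quick consistency check on the two expressions.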
For the calculation of the costs and operational benefits associated with the
tunnel proposal, three time periods for each weekday are defined:

peak periods - 7 to 10 am and 4 to 7 pm;

off-peak periods - 10 am to 4 pm and 7 to 11 pm; and

night period - 11 pm to 7 am.

As travel times are flow dependent, it was necessary to estimate typical hourly
traffic flows for these three time periods over the life of the project (for evaluation
purposes taken to be up to 2021). Based on historical data, regression analyses of
the temporal distribution of traffic, as a function of Average Annual Weekday
Traffic, lead to the following equations:

$$Y_p = 6.593\,(\ldots) \qquad (r^2 = 0.99)$$

$$Y_{op} = 0.070\,Y_{wt}^{(\ldots)} \qquad (r^2 = 1.00)$$

where,

$Y_p$ = estimate of peak period traffic volumes;

$Y_{op}$ = estimate of off-peak period traffic volumes; and

$Y_{wt}$ = estimate of average weekday traffic volumes.

To ensure the temporal distribution of traffic is properly constrained by the
total average daily traffic figure, the estimate of the night period traffic volumes $Y_n$
becomes:

$$Y_n = Y_{wt} - Y_p - Y_{op}.$$
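
As a rough illustration of how the residual night volume and the period definitions combine into hourly flows, the sketch below divides each period volume by the number of hours it covers (6, 10 and 8 hours, from the period definitions). Dividing by period length is an assumption about how "typical hourly flows" were formed, and the sample volumes are hypothetical.

```python
# Hours covered by each weekday period, from the definitions in the text.
PERIOD_HOURS = {"peak": 6, "off_peak": 10, "night": 8}


def night_volume(y_wt: float, y_p: float, y_op: float) -> float:
    """Night traffic as the residual Y_n = Y_wt - Y_p - Y_op."""
    return y_wt - y_p - y_op


def hourly_flows(y_wt: float, y_p: float, y_op: float) -> dict:
    """Average hourly flow (vehicles/hour) in each period."""
    volumes = {
        "peak": y_p,
        "off_peak": y_op,
        "night": night_volume(y_wt, y_p, y_op),
    }
    return {name: volumes[name] / PERIOD_HOURS[name] for name in PERIOD_HOURS}


# Hypothetical daily volumes, for illustration only.
flows = hourly_flows(y_wt=100_000.0, y_p=48_000.0, y_op=44_000.0)
print(flows)
```

Because the night volume is computed as a residual, the three period volumes always sum to the weekday total, whatever regression estimates are supplied for the peak and off-peak periods.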

Traffic assignment to a network requires the rate of demand to be established.
The hourly traffic flows for the three time periods give the demand rate, and the
impact of projected traffic volumes on travel time was estimated using Davidson's
(1966) travel time/traffic flow relationship. Thus, travel times at different times
of the day are computed first for the bridge-only situation and then for the
bridge-with-tunnel alternative.
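
For readers unfamiliar with Davidson's relationship, a minimal sketch of its commonly quoted form, t = t0 (1 + J x / (1 - x)) with x the ratio of flow to capacity, is given below. The parameter values (free-flow time, capacity, J) are hypothetical and are not the values used in the study.

```python
def davidson_travel_time(flow: float, capacity: float,
                         free_flow_time: float, j: float) -> float:
    """Davidson (1966): t = t0 * (1 + J * x / (1 - x)) with x = flow / capacity."""
    if not 0.0 <= flow < capacity:
        raise ValueError("flow must satisfy 0 <= flow < capacity")
    x = flow / capacity
    return free_flow_time * (1.0 + j * x / (1.0 - x))


# Hypothetical link: 2-minute free-flow time, capacity 8000 veh/h, J = 0.2.
for q in (2000.0, 4000.0, 6000.0):
    print(q, round(davidson_travel_time(q, 8000.0, 2.0, 0.2), 3))
```

Travel time rises slowly at low flows and steeply as flow approaches capacity, which is why the hourly flows for each period, rather than daily totals, are needed as input.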

The bridge-only base case and the tunnel alternative can be compared in the
form of four different measures: travel time saving; fuel consumption; vehicle
operating costs; and accident savings (Table 1). As mentioned before, the travel
time differences were based on Davidson's model. Differences in fuel
consumption were computed using the model reported by Bowyer et al. (1984,
1985). Differences in vehicle operating costs were computed using published data
from the New South Wales Road Freight Transport Industry Council (1986) for
trucks and Royal Auto (July, 1986) for motor vehicles. Abelson (1987) also
provides comprehensive methods to compute vehicle operating costs. Accident
costs, which average 1 cent per vehicle kilometre, were based directly on
Department of Main Roads data (Cameron McNamara, 1986b), and were applied
to the distance saving of 800 metres via the tunnel made by 33 per cent of weekday
and weekend traffic. Table 1 sets out some economic evaluation parameters: the
tunnel construction cost, its annual maintenance and operating cost, and the
monetary items for travel time, vehicle operating cost, fuel saving and accident
costs.

Table 1. Summary of Economic Evaluation Parameters for the Sydney Harbour Tunnel

Parameter                               Value

Construction costs                      $395 million
(limited clearance tunnel)
Annual operating / maintenance cost     $7.9 million
Weighted monetary value of time         $6.00 per peak hour (1992-1999)
                                        $7.70 per hour (1993 onwards)
Vehicle operating cost                  $0.16 per veh / km
Vehicle occupancy                       1.4 persons / vehicle
Vehicle accident costs                  $0.01 per veh / km
Fuel savings                            $0.55 per litre
Traffic estimation ("most likely")      Y = 135000 / (1 + 10^(...))
for bridge and tunnel                   where,
                                        Y = average annual weekday
                                        traffic southbound
                                        x = number of years from the tunnel
                                        opening year 1992
Benefit cost ratios                     1.9 at 4% p.a. discount rate
                                        1.2 at 7% p.a. discount rate
                                        0.8 at 10% p.a. discount rate

(Source: Cameron McNamara, 1986b)

The basic economic evaluation parameters given in Table 1 were used both by the
tunnel proponents and by Unisearch Ltd. The most likely traffic projection used
by the consultants to the Joint Venturers is given in this table. From these inputs,
the consultants undertook an economic evaluation of the tunnel proposal and
estimated the benefit cost ratios as ranging from 1.9 to 0.8 depending on the
discount rate adopted.

The independent economic evaluation by Unisearch Ltd (1987a) also used the
values in Table 1 as inputs but applied the traffic model described above to give a
more accurate representation of the temporal traffic flows over the bridge and
tunnel. A micro-computer model was developed to facilitate sensitivity analyses
for variations in costs and benefits and in the values of the economic evaluation
parameters. This approach gave lower benefit cost ratios than those derived by the
consultants to the Joint Venturers and given in Table 1.

The additional advantage of the Unisearch Ltd approach was that it allowed a
ready comparison of other cross harbour transport proposals, such as an
augmented  Sydney  Harbour  Bridge  and  a  cross  harbour  rail  tunnel  (Unisearch, 
1987b).  Figure  2  illustrates  these  alternatives  together  with  the  road  tunnel 
proposal. The augmented bridge gave the highest benefit cost ratio, primarily
because of its relatively low capital cost of $44 million. A rail tunnel, together with
the  extra  two  traffic  lanes  on  the  Sydney  Harbour  Bridge  (replacing  the  existing 
rail  tracks),  gave  a  benefit  cost  ratio  very  similar  to  that  of  a  road  tunnel. 
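
The dependence of a benefit cost ratio on the discount rate can be illustrated with a short discounting sketch. The capital and annual running costs below are the Table 1 figures (in $ million); the constant annual benefit and the 30-year horizon are hypothetical placeholders, so the printed ratios are not the study's results.

```python
def present_value(annual_amount: float, rate: float, years: int) -> float:
    """Present value of a constant end-of-year amount over `years` years."""
    return sum(annual_amount / (1.0 + rate) ** t for t in range(1, years + 1))


def benefit_cost_ratio(capital_cost: float, annual_cost: float,
                       annual_benefit: float, rate: float, years: int) -> float:
    """Discounted benefits divided by capital cost plus discounted running costs."""
    benefits = present_value(annual_benefit, rate, years)
    costs = capital_cost + present_value(annual_cost, rate, years)
    return benefits / costs


# Costs in $ million from Table 1; the annual benefit of 40 is hypothetical.
for rate in (0.04, 0.07, 0.10):
    bcr = benefit_cost_ratio(395.0, 7.9, 40.0, rate, 30)
    print(f"{rate:.0%}: {bcr:.2f}")
```

Because the capital cost is incurred up front while benefits accrue over time, the ratio falls as the discount rate rises, which is the pattern reported for the tunnel.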

In May 1987, the government decided to proceed with the tunnel project, despite
widespread media criticism of its financial viability. The determination by the
Commissioner for Main Roads used higher monetary values for travel time than
those in Table 1, and included a salvage value for the tunnel. The tunnel is under
construction and is scheduled to open to road traffic in September 1992. To date,
traffic using the Sydney Harbour Bridge is below that forecast by the consultants
to the Joint Venturers.


Fig. 2: Sydney Harbour Tunnel and Alternative Proposals.
ROUTE CHOICE MODELLING

Whereas route choice modelling was not necessary in the evaluation of the
Sydney Harbour Tunnel because the traffic demands are split between parallel
facilities according to Wardrop's principle of an equal travel time assignment, it
was an important technical feature of traffic modelling of both rural tollways and
metropolitan toll roads. The government's principle in the development of any toll
road is that a "free" alternative route must be available for drivers. Therefore, a
route choice model was developed by the authors to forecast future levels of traffic
on three separate toll road projects at various levels of toll charges. Price elasticities
of demand for cars and trucks are available for a number of US toll facilities, as
shown in Table 2. In general, when the cost of using the toll facility is increased, the
traffic volume is reduced.


Table 2. Price Elasticity of Demand for Cars and Commercial Vehicles on US Tollways

                       Cars                      Commercial Vehicles
Location          Toll Increase  Elasticity   Toll Increase  Elasticity
                       (%)                         (%)

1. ROADS
Pennsylvania           24          -0.08           24          -0.06
New Jersey             20          -0.13           30          -0.17
Indiana                20          -0.31           30          -0.17
Massachusetts          30          -0.18           30          -0.17
Oklahoma 1             17          -0.21           33          -0.25
         2             17          -0.30           33          -0.08
         3              9          -0.25           22          -0.13
         4             18          -0.25           36          -0.19
         5             11          -0.31           44          -0.12

2. BRIDGES
Delaware               20          -0.26           43          -0.25
Chesapeake Bay         15          -0.15           15          -0.26

(Source: based on Wuestfeld and Regan, 1981)
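
To show how such elasticities translate into traffic changes, the sketch below applies the usual point approximation (percentage change in traffic = elasticity x percentage change in toll) to three of the car entries in Table 2. The function name is an assumption made for illustration, and the approximation is only reasonable for modest toll changes.

```python
def traffic_change_percent(elasticity: float, toll_increase_percent: float) -> float:
    """Approximate % change in traffic for a given % toll increase."""
    return elasticity * toll_increase_percent


# (toll increase %, elasticity) for cars, taken from Table 2.
CAR_RESPONSES = {
    "Pennsylvania": (24.0, -0.08),
    "New Jersey": (20.0, -0.13),
    "Indiana": (20.0, -0.31),
}

for location, (increase, elasticity) in CAR_RESPONSES.items():
    change = traffic_change_percent(elasticity, increase)
    print(f"{location}: {change:.2f}% change in car traffic")
```

All the tabulated elasticities are small in magnitude (between -0.06 and -0.31), so even toll increases of 20 to 40 per cent imply traffic reductions of only a few per cent.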


The authors applied a binary logit model to split the total traffic volume in a
given corridor between the toll road and the non-toll road. Ben-Akiva and Lerman
(1979) show the applicability of the logit formulation for choice modelling purposes.
The formula for a binary choice logit model for route assignment is:

$$P(t) = \frac{1}{1 + \exp(\alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3)}$$

where,

$P(t)$ = probability of using the tollway;

$x_1$ = differences in route distances;

$x_2$ = differences in route travel times;

$x_3$ = differences in operating costs;

$\alpha$ = constant term; and

$\beta_1, \beta_2, \beta_3$ = coefficients.

However, because of the paucity of data the three explanatory variables are
combined into one standard composite variable, called generalised cost, which
is a combination of travel time and cost, as follows:

$$P(t) = \frac{1}{1 + \exp(\beta x)}$$

where $x$ = differences in generalised cost.
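
A minimal sketch of the single-variable logit split follows. The coefficient value and the sign convention (x taken as the generalised cost of the toll road minus that of the free route, in dollars) are assumptions for illustration; the calibrated coefficient is not reported here.

```python
import math


def toll_road_share(cost_difference: float, beta: float) -> float:
    """Binary logit: P(t) = 1 / (1 + exp(beta * x))."""
    return 1.0 / (1.0 + math.exp(beta * cost_difference))


# beta = 0.15 per dollar is a hypothetical value, not the calibrated coefficient.
BETA = 0.15
for x in (-10.0, 0.0, 10.0):  # toll road cost minus free route cost ($)
    print(x, round(toll_road_share(x, BETA), 3))
```

With a positive coefficient under this convention, the toll road attracts half the corridor traffic when the two routes cost the same, and a progressively smaller share as its generalised cost rises above that of the free route.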

The model was calibrated using data obtained from the Sydney-Wollongong
Tollway (see Section 2) in 1987.

The above logit model was applied, for example, in traffic estimation and appraisal
of a proposal for a tollway from Queanbeyan (near Canberra, Australia's capital
city) to the South Coast of New South Wales. This particular application was part
of the research and development of a consultancy project undertaken by all three
authors on behalf of Unisearch Ltd for a private-sector consortium. The
essential features of the existing situation are as follows. The distance
from Canberra to Moruya (on the South Coast) via the Araluen Valley is 162 km.
The Araluen Valley road is about 7 m wide and of gravel construction. It is a very
steep mountain road with many tight curves and is presently not a feasible route
for Canberra-South Coast traffic. For instance, in a field study on the drive
from Moruya to Araluen on a Saturday morning in April (autumn), only three
cars and two motorcycles were observed. On the other hand, Canberra to Moruya
via the Kings Highway (Main Road 51) and Batemans Bay is 152 km. From
Braidwood to Batemans Bay the distance is 61 km, with a winding section of road
through Clyde Mountain. On weekdays, the traffic flow is light between
Bungendore (25 km to the east of Queanbeyan) and Batemans Bay, because two
thirds of traffic to and from Queanbeyan leaves Main Road 51 at Bungendore,
with destinations to and from Goulburn, located to the north.

Over a twenty-year period from 1967 to 1988, traffic counts by the Department
of Main Roads, New South Wales, show that the number of vehicles using the
Kings Highway has increased from about 1000 vehicles per day to about 3000
vehicles per day. However, the average traffic counts conceal both the seasonal
and weekend characteristics of traffic. The general level of traffic activity noted
above was confirmed by a survey undertaken by the local government authority,
Tallaganda Shire Council, in December 1987.

Four different alignments proposed for the toll road, and considered in our
analysis, are shown schematically in Figure 3. In the context of the wider land-use /
transport system, we observe that the toll road concept is a sound one. It is
consistent with current New South Wales Government policy on roads. An
alternative, free road would be available to motorists, irrespective of any toll road
alignment finally adopted. A toll road between Queanbeyan and Moruya would
give genuine route distance savings to motorists (who currently use the Kings Highway
to gain access to and from the coast) from Goulburn, Canberra, and parts of
country New South Wales (Fig. 4). The alignment would also make travel from
Melbourne via Canberra to the South Coast almost as short as the currently
favoured route via the Princes Highway (782 km compared with 742 km).
Distance savings to road users represent genuine resource savings in fuel
consumption and vehicle operating costs.


Fig. 3: Network Representation of Toll Road Alignments. (Diagram labels: Queanbeyan; Moruya;
existing road: 167 km; existing road: 90 km; 61 or 69 km; 140 km.)

Fig 4. Effect of Toll Road on Distances to Moruya (South Coast).


The micro-computer-based traffic forecasting model that was developed allows
a variety of assumptions about toll road characteristics to be analysed: length, tolls
charged, whether tolls are indexed or unindexed, and whether drivers decide on
their route choice because of total or perceived costs. Toll charged, value of
travel time, and fuel consumption costs are taken into account in the analysis
based on perceived costs. In addition, vehicle operating cost (which includes
depreciation and tyre wear) is included in the total costs based method. The level
of traffic obtained from the total costs method is generally less than that obtained
from the perceived costs method because of the increased operating cost for toll road
users. The results of this comprehensive analysis, using a logit model to take into
account the behavioural response of travellers for the two toll road alignments, have
been presented elsewhere (Unisearch Ltd, 1989).
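
The distinction between the perceived costs and total costs methods can be sketched as follows. The `Trip` structure, the value of time and all trip figures are hypothetical; only the composition of the two cost measures follows the description above.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    toll: float            # $ per trip
    hours: float           # travel time in hours
    fuel_cost: float       # $ per trip
    operating_cost: float  # $ per trip: depreciation, tyre wear and similar


def perceived_cost(trip: Trip, value_of_time: float) -> float:
    """Toll + value of travel time + fuel, as in the perceived costs method."""
    return trip.toll + value_of_time * trip.hours + trip.fuel_cost


def total_cost(trip: Trip, value_of_time: float) -> float:
    """Perceived cost plus vehicle operating cost, as in the total costs method."""
    return perceived_cost(trip, value_of_time) + trip.operating_cost


# All figures below are hypothetical.
toll_route = Trip(toll=10.0, hours=1.5, fuel_cost=9.0, operating_cost=26.0)
free_route = Trip(toll=0.0, hours=2.0, fuel_cost=11.0, operating_cost=24.0)
vot = 8.0  # $ per hour
x_perceived = perceived_cost(toll_route, vot) - perceived_cost(free_route, vot)
x_total = total_cost(toll_route, vot) - total_cost(free_route, vot)
print(x_perceived, x_total)
```

In this made-up case the toll route carries the higher operating cost, so the generalised cost difference is larger under the total costs method, and a logit split based on it would assign less traffic to the toll road.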

A brief summary of this comprehensive analysis makes the following points.
Based on the historical growth of traffic on the Kings Highway, and traffic
projection by trend extrapolation into the future, it can be shown that a toll road is
not financially viable (given its construction, operation and maintenance costs).
This is summarised in Fig 5 for the corridor between Queanbeyan and Batemans
Bay and in Fig 6 for the corridor between Queanbeyan and Moruya. The graphs
show the relative traffic levels for long and short alignments, for perceived and
total costs, and for four toll levels ($5 for cars, $10 for trucks; $10 for cars, $20 for
trucks; $20 for cars, $30 for trucks; and $30 for cars, $50 for trucks). In conclusion,
the level of traffic on these toll road alignments, considering the historical
growth of traffic alone, is insufficient to justify investment.
Fig 5. Traffic Estimates for Different Toll Levels - Corridor Between Queanbeyan and
Batemans Bay (South Coast). (Horizontal axis: toll level for cars and trucks.)

Fig 6. Traffic Estimates for Different Toll Levels - Corridor Between Queanbeyan and Moruya
(South Coast). (Horizontal axis: toll level for cars and trucks.)
However, trend extrapolation can be a misleading technique for long-term
traffic forecasting, especially when there is substantial change in the land-use
context. Therefore, the second stage of the analysis took a different approach. We
asked the question: what level of traffic would be required to give a return on
investment, assuming construction costs and maintenance costs of the toll road
were known? This is referred to as the break-even traffic analysis. The steps involved
in this break-even analysis are shown in Fig 7.


Fig 7. Conceptual Models for the Estimation of Break-Even Traffic Levels for Toll Road
Investment.

The results are summarised in Fig. 8. The graphs show the traffic volume
required over a period of 30 years to make the proposal financially viable (8
per cent return on investment) for three toll levels ($5 for cars, $10 for trucks; $10 for cars,
$20 for trucks; and $20 for cars, $30 for trucks). Assuming a toll level of $10 for cars,
the traffic levels required initially would be in the order of 14000 vehicles per day.
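
The logic of a break-even calculation can be sketched under strong simplifying assumptions: traffic is held constant over the 30 years, a single average toll applies to all vehicles, and the capital and running costs are invented figures. None of these inputs comes from the study; only the 8 per cent return and the 30-year period are taken from the text.

```python
def annuity_factor(rate: float, years: int) -> float:
    """Present value of $1 received at the end of each year for `years` years."""
    return sum(1.0 / (1.0 + rate) ** t for t in range(1, years + 1))


def break_even_aadt(capital_cost: float, annual_cost: float, avg_toll: float,
                    rate: float = 0.08, years: int = 30) -> float:
    """Constant daily traffic whose discounted toll revenue just covers costs."""
    factor = annuity_factor(rate, years)
    discounted_costs = capital_cost + annual_cost * factor
    return discounted_costs / (365.0 * avg_toll * factor)


# Hypothetical inputs: $500m to build, $5m a year to run, $10 average toll.
required = break_even_aadt(500e6, 5e6, 10.0)
print(round(required))
```

The study's own analysis produces a traffic profile over time rather than a single constant volume, so this sketch shows only the discounting step that links costs, tolls and required traffic.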


Fig 8. Average Annual Daily Traffic Over a 30-year Period Required to Financially Justify a
Toll Road. (Horizontal axis: time in years.)
The significance of identifying the level of traffic required for a break-even
revenue is to determine the difference between traffic forecasts based on
historical trends, and the traffic required to justify, on economic grounds,
investment in new roads. This traffic shortfall represents the amount of annual
traffic that would have to be generated or induced by new land-use developments.
In the context of the Queanbeyan-South Coast toll road proposal, these
developments relate primarily to tourism and the attraction of the coast both for
retired people (from Canberra) and for holiday home investment. Also considered
was the relationship between the Very Fast Train (VFT) proposal linking Sydney,
Canberra, and Melbourne (a feasibility study is due for completion in 1991) and its
influence on tourist traffic, especially the role of the tollway as a "feeder service"
to this new railway. The traffic results of these scenarios are beyond the scope of
this paper, but nevertheless they formed an important part of the traffic forecasting
methodology that was developed for the client.

CONCLUSIONS 

The methodology for economic evaluation and financial appraisal of transport
facilities which aim to attract private-sector funding has been described using two
case studies. The first case study, the $400 million Sydney Harbour Tunnel project,
required the development of a methodology that took standard AADT traffic
projections and separated them into weekday and weekend traffic and then into
traffic by three periods of the day. The traffic estimation procedure provided the
necessary input to the application of Davidson's travel time/flow model to transform
the traffic projections into level of service measures, such as travel time, fuel
consumption, vehicle operating costs, and frequency of accidents. Standard
economic evaluations (benefit cost ratio, internal rate of return, and net present
value) were performed based on the predicted variations of the above measures
under a number of alternative transport improvement strategies. A flexible
micro-computer program allowed sensitivity analyses to be readily undertaken.

The second case study, of a proposed toll road project in rural New South Wales,
was introduced to demonstrate the application of the logit model to account for
the manner in which road users weigh up the costs and benefits in choosing
between alternative routes. The logit model was applied to estimate the potential
traffic share on a toll facility, compared to the total traffic volume of the corridor,
given assumptions about route lengths, speeds and toll charges. This particular
case study was also used to demonstrate the break-even traffic analysis developed
to estimate the annual level of traffic required over the project lifetime to attract
private investment.

This paper has explained the research and development underpinning major
independent, expert consultancy advice to government and to the private sector
by the University of New South Wales R & D company, Unisearch Ltd. Although
economic evaluation has become standard practice in road project appraisal, there
are nevertheless technical issues in the quantification of benefits (and also of
environmental costs that have not been addressed in this paper). Quantification of
benefits requires accurate traffic estimation because the level of facility use will be
an important factor in total user benefits and in the financial viability of any
project financed on a user-pays principle. In Australia, as direct experience with
tollways is limited, as outlined in Section 2, and our search through the Australian
literature failed to discover any suitable methodology, it was necessary for the
authors to develop original traffic models as one part of the overall economic and
financial evaluation process, and these have been explained in technical detail in this
paper. To date no methodological work on the private-sector toll road proposals
in New South Wales has been published, and so this work is presented to stimulate
discussion about the methodology to improve the accuracy of traffic modelling
and forecasting exercises.
Abstract: The talk gives an introduction to the OpTiX-II DSS for the modelling and parallel solution of nonlinear
optimization problems which arise especially in the fields of engineering design and production planning. The DSS
supports all steps from the formulation of nonlinear optimization problems to the solution on parallel computers.
OpTiX-II thereby provides an engineer / decision maker with the knowledge of an optimization / computer expert in
the form of software. In order to reduce the overall computing time and to improve the quality of the solution obtained,
much emphasis has been placed on decomposition principles and nonsequential solution approaches in mathematical
optimization.

1.  Introduction 

OpTiX-II is an interactive decision support system for the solution of nonlinear optimization
problems which arise especially in the fields of engineering design and production planning. The
OpTiX-II software environment supports an engineer / decision maker with the knowledge of an
optimization / computer expert in the form of software. It makes use of modern computer technology
(e.g. computer networks, parallel computers) for faster and more reliable problem solutions
and supplies an easy-to-use graphical interface for untrained computer users. In order to reduce
the overall computing time and to improve the quality of the solutions obtained, much emphasis
has been placed on decomposition principles and nonsequential solution approaches in mathematical
optimization. Parallelism in the solution of nonlinear optimization problems can be exploited
at several levels:

(I) First, the application of decomposition techniques and multi-level optimization strategies
leads to 1st level optimization subproblems and a 2nd level coordination problem. These
decomposed optimization problems are solved by primal decomposition methods (feasible
method) or dual decomposition methods (non-feasible method). In both cases, the solution of
the 1st level optimization subproblems is well suited for coarse-grained parallel computation.
Therefore OpTiX-II may distribute these subsystem optimizations onto a network of
heterogeneous Unix workstations or parallel MIMD-type computers.

(II) Secondly, parallel implementations of the classical algorithms based on the Lagrange or
Kuhn-Tucker theory make use of fine-grained parallelism. This leads to a high communication
effort between the processors involved in the computation. Therefore, it is advisable to
use strongly coupled parallel computers with a high communication bandwidth, e.g. shared
memory multiprocessor systems. Currently OpTiX-II supports some parallel implementations
of classical algorithms on shared memory multiprocessor systems using the Unix operating
system.

(III) In their mathematical description, practical nonlinear optimization problems frequently
consist of highly nonlinear objective functions and constraints. In these cases the assumptions
about unimodality, convexity, and smoothness made by well-known solution methods in nonlinear
optimization are mostly invalid. OpTiX-II allows a simultaneous combination of different
optimization algorithms to be applied to one optimization problem. The controlled information
exchange between the participating, parallel-running methods is the basis for a more reliable
and, in some cases, even faster solution.
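
The coarse-grained scheme of level (I) can be illustrated with a toy two-level problem in Python. This is a sketch under stated assumptions, not OpTiX-II code: the two quadratic subsystems, the golden-section search and the use of a thread pool are all chosen here for brevity. For a fixed coordination variable y the two 1st level subproblems are independent and are submitted concurrently; the 2nd level then adjusts y.

```python
from concurrent.futures import ThreadPoolExecutor


def golden_section(f, lo, hi, tol=1e-8):
    """Minimise a unimodal function of one variable on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


def subsystem_1(y):
    """1st level: minimum over a of (a - y)^2 + a^2 for fixed y."""
    cost = lambda a: (a - y) ** 2 + a ** 2
    return cost(golden_section(cost, -5.0, 5.0))


def subsystem_2(y):
    """1st level: minimum over b of (b - 2 + y)^2 + b^2 for fixed y."""
    cost = lambda b: (b - 2.0 + y) ** 2 + b ** 2
    return cost(golden_section(cost, -5.0, 5.0))


def coordinate(pool):
    """2nd level: choose y to minimise the sum of the subsystem optima."""
    def total(y):
        futures = [pool.submit(subsystem_1, y), pool.submit(subsystem_2, y)]
        return sum(future.result() for future in futures)

    y_best = golden_section(total, -5.0, 5.0)
    return y_best, total(y_best)


with ThreadPoolExecutor(max_workers=2) as pool:
    y_opt, f_opt = coordinate(pool)
print(round(y_opt, 4), round(f_opt, 4))
```

For this toy problem the subsystem optima are y^2/2 and (2 - y)^2/2, so the coordinated optimum lies at y = 1 with total cost 1. Python threads share one interpreter, so the pool only shows the structure; a real deployment would place each subsystem on its own processor or workstation, as the text describes.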
Work with OpTiX-II consists of three phases. During the problem formulation phase (I), the user
defines the precise mathematical formulation of the optimization problem under analysis (see
Section 2). This formulation is then translated (II) into a machine representation, which is suitable
for parallel processing in heterogeneous networks (see Section 3). The third phase, used for the
solution of the optimization problem, is described in Section 4. Within this step (III), the user has
to define an optimization strategy by choosing a combination of optimization algorithms and then
start the optimization process.

2.  Problem  Formulation  Phase 

This phase is supported by the OpTiX-II Edit-Environment (fig. 1), which is used for
formulating the optimization problem, for controlling the generation of optimization servers for
different platforms and for starting the execution environment.

The problem description is entered into a graphically controlled problem editor using the OpTiX-II
problem description language, which resembles the mathematical notation for nonlinear
optimization problems (fig. 2). In many practical situations, complex optimization problems cannot be
described by simple mathematical notation. Therefore, OpTiX-II allows the inclusion of external
functions written in the programming languages C or Fortran (fig. 1). Calls to these functions may
become a subexpression of an objective function or a constraint. Even commercially available
simulation packages for the solution of complex mathematical models may be included in such a
way. Other language constructs allow the formulation of decomposed optimization problems (fig.
2) so that multi-level optimization strategies may be applied.

Fig. 1: The OpTiX-II Edit-Environment. (Screenshot callouts: load a problem description; save a
problem description; compile the problem description into C routines; name of the file loaded into
the text editor; text editor for entering problem descriptions; start the execution environment;
compilation of the subroutines generated by the compiler and generation of optimization servers
for all supported platforms; Unix command shell; control output from the compiler.)

3. Problem Translation Phase

This phase is controlled by the "Compile" and "Build executables" menus from the Edit-Environment.
The selection of the "Compile" button starts the OpTiX-II problem compiler, which translates
the problem description into a collection of functions written in the programming language C.
In doing so, the compiler calculates symbolic first and second order derivatives for all objective
functions and constraints. This ensemble of C functions is then compiled and linked into optimization
servers for different hardware/software platforms (e.g. SPARC/Solaris, MIPS/Ultrix,
Transputer/Helios). These optimization servers are called from within the execution environment
(in phase III). This approach allows parallel optimization in heterogeneous computer networks.


/* gear reducer decomposed */
realvar x1,x2,x3,x4,x5,x6,x7,f,f1,f2;

problem

  subsystem "shaft_and_bearings_1":
    f1 = min -1.508*x1*sqr(x6) + 7.477*x6^3 + 0.7854*x4*sqr(x6);
    decisionvar x4,x6;
    constraints
      /* g3 */  1.93/x2/x3*x4^3/x6^4 <= 1;
      /* g5 */  sqrt(sqr(745*x4/x2/x3)+16.9E6)/0.1/x6^3 <= 1100;
      /* g24 */ (1.5*x6+1.9)/x4 <= 1;
    bounds
      /* g16,g17 */ 7.3 <= x4 <= 8.3;
      /* g20,g21 */ 2.9 <= x6 <= 3.9;
  endsubsystem;

  subsystem "shaft_and_bearings_2":
    f2 = min -1.508*x1*sqr(x7) + 7.477*x7^3
             + 0.7854*x5*x7^2;
    decisionvar x5,x7;
    constraints
      /* g4 */  1.93/x2/x3*x5^3/x7^4 <= 1;
      /* g6 */  sqrt(sqr(745*x5/x2/x3)+157.5E6)
                /0.1/x7^3 <= 850;
      /* g25 */ (1.1*x7+1.9)/x5 <= 1;
    bounds
      /* g18,g19 */ 7.3 <= x5 <= 8.3;
      /* g22,g23 */ 5.0 <= x7 <= 5.5;
  endsubsystem;

  subsystem "Gear_Reducer_2nd_level":
    f = min -1.508*x1*sqr(x6) + 7.477*x6^3 + 0.7854*x4*sqr(x6)
            -1.508*x1*sqr(x7) + 7.477*x7^3 + 0.7854*x5*x7^2
            +0.7854*x1*sqr(x2)*(3.3333*sqr(x3)+14.9334*x3-43.0934);
    decisionvar x1,x2,x3;
    constraints
      /* g1 */ 27/x1/sqr(x2)/x3 <= 1;
      /* g2 */ 397.5/x1/sqr(x2)/sqr(x3) <= 1;
      /* g7 */ x2*x3 <= 40;
      /* g8 */ x1/x2 >= 5;
      /* g9 */ x1/x2 <= 12;
    bounds
      /* g10,g11 */ 2.6 <= x1 <= 3.6;
      /* g12,g13 */ 0.7 <= x2 <= 0.8;
      /* g14,g15 */ 17 <= x3 <= 28;
  endsubsystem;
endproblem;


Fig. 2: Example of an OpTiX-II problem description for a decomposed nonlinear optimization
problem ([Azarm90], [Golinski70]).
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Cunently  the  problem  translation  phase  generates  code  fon 

(i)  Unix  wtnicstations  from  different  manufacturers. 

(ii)  MIMD  computers  with  shared  memory: 

-  multiprocessor  Unix  woricstatkms/servers,  e.g.  Sun  SPARCstation  10  and  600 
series.  Sun  SPARCcenter  2000  series. 

(iii)  MIMD  computers  with  distributed  memory: 

-  Tiansputerclusters. 

-  Workstation  networks,  regarded  as  loosely  coupled  multiprocessor-system. 


4.  Problem  Solution  Phase 

The OpTiX-II execution environment is used for controlling the ongoing optimization process. It
distributes the computations onto NFS-based heterogeneous computer networks, multiprocessor
workstations and transputer-based parallel computers. Furthermore, the Execution-Environment
records the problem solution process and displays the results. The user interacts with the control
module of the Execution-Environment (fig. 3). This user interface corresponds to the Open Look
standard and is completely interactive. In the simplest case of a non-decomposed optimization
problem, the user selects an optimization algorithm from the algorithms list, a host for execution
from the hosts panel, the "add" option from the "edit" menu, and presses the start button. The
optimization results are displayed in the control module window (fig. 3). After each computation,
the user may select another algorithm and continue the optimization by pressing the continue
button. For difficult or decomposed optimization problems, the user has to define a more complicated
strategy script, defining the optimization steps that have to be taken. In each step, the user may
combine the following strategies:

(i) In the case of decomposed optimization problems, all subsystem optimizations can be run in
parallel, reducing the overall computation time. The user may choose this strategy by
selecting a 3-tuple (subsystem, algorithm, host for computation) for each subsystem within
the control module (fig. 3).

(ii) The user may apply a parallel optimization method (to a subsystem optimization), if a
parallel computer with shared memory is available.

(iii) A simultaneous combination of different optimization algorithms may be applied to one
optimization (sub)system. In this situation, the controlled information exchange between the
participating, parallel-running methods is the basis for a more reliable and, in some cases,
even faster solution ([Boden91a], [Boden91b]). This approach is similar to the hybrid
optimization methods described in [Burdakov88] and [Kleinmichel92]. Their idea is to define tests for
switching between a globally convergent method I and a locally superlinearly convergent
method II in order to obtain a globally and locally superlinearly convergent method (fig. 4).
The OpTiX-II user may apply both methods in parallel on different computing nodes. After a
user-definable number of iterations, the best value of both strategies is selected and used as a
basis for further computations.
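Strategy (iii) can be illustrated with a small sketch. It is not OpTiX-II code: the test function, the two stand-in methods (fixed-step gradient descent for "method I", Newton's method for "method II"), the thread pool that mimics two computing nodes, and all parameter names are assumptions chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical smooth test function with its minimum at x = 2.
def f(x):
    return (x - 2.0) ** 4 + (x - 2.0) ** 2

def df(x):
    return 4.0 * (x - 2.0) ** 3 + 2.0 * (x - 2.0)

def d2f(x):
    return 12.0 * (x - 2.0) ** 2 + 2.0

def gradient_steps(x, iterations, step=0.005):
    """Stand-in for method I: gradient descent with a small fixed step."""
    for _ in range(iterations):
        x -= step * df(x)
    return x

def newton_steps(x, iterations):
    """Stand-in for method II: Newton's method, fast close to the solution."""
    for _ in range(iterations):
        x -= df(x) / d2f(x)
    return x

def hybrid_minimize(x0, rounds=5, iterations=3):
    """Run both methods side by side; after `iterations` steps keep the better point."""
    x = x0
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(rounds):
            a = pool.submit(gradient_steps, x, iterations)
            b = pool.submit(newton_steps, x, iterations)
            # The best value of both strategies becomes the next starting point.
            x = min((a.result(), b.result()), key=f)
    return x

x_best = hybrid_minimize(x0=6.0)
```

The selection step after each round plays the role of the user-definable synchronization point; in a real setting the two workers would be optimization servers on different hosts rather than threads.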

The basic control unit in OpTiX-II is a block (fig. 5) that consists of a sequence of optimization
steps, each using the strategies described in (i) to (iii). By using several blocks in parallel, the
calculations may be started from different initial points. In this way, problems resulting from
multimodality, nondifferentiability, and nonconvexity of the feasible domain may be overcome.

For the solution of decomposable nonlinear optimization problems the user may apply primal
decomposition methods (resource division, principle of interaction prediction, feasible method) or
dual decomposition methods (objective function modification, non-feasible method). In both
cases, the user has to define first and second level (coordination) problems within the problem
editor. The user then defines a control strategy consisting of a block with two steps. In the first
step, at least one 2-tuple (algorithm, host for computation) is selected for each first level
subsystem. Thereafter, in the second step, at least one 2-tuple (algorithm, host for computation)
is chosen for the second level (coordination) problem. This block is executed several times until
convergence is obtained.

Fig. 4: The coupling scheme within hybrid optimization methods [Kleinmichel92].

The user may set up experiments and adapt the program to an optimal sequence of (parallel
running) algorithms for the given type of nonlinear constrained optimization problem.

Transp. Eng., Technical University of Budapest, Hungary)

Extended  abstract 

Consider the gradually more and more complex problems
of single row routing, channel routing and switchbox routing
on the one hand; and the gradually less and less restrictive
models (1-layer, Manhattan, unconstrained 2-layer,
multilayer) on the other hand. The single row routing
problem can always be solved in the Manhattan model, and the
channel routing problem can always be solved in the
unconstrained 2-layer model, in fact, both in linear time.
We show that the switchbox routing problem is solvable, even
in linear time, in the multilayer model.

I. 

A switchbox is a rectangular grid G of horizontal
tracks (numbered from 0 to w+1) and vertical columns
(numbered from 0 to h+1), where w and h are the width and
the height of the switchbox. The boundary points of G are
called

- Northern if their coordinates are of form (i, w+1) with
i=1,2,...,h;

- Southern with form (i,0) where i=1,2,...,h;

- Eastern with form (h+1,j) where j=1,2,...,w; and

- Western with form (0,j) where j=1,2,...,w.

For example, Figure 1 is a switchbox with width 4 and height
5, where Northern, Southern, Eastern and Western boundary
points are denoted by x's, plus signs, empty and solid dots,
respectively. The "corners" of G will not be considered as
boundary points.

A net is a collection of boundary points. A switchbox
routing problem (SRP) is a set of pairwise disjoint nets. If
every boundary point of every net is Southern, the SRP is
called single row routing problem. If they are all Southern
or Northern, the SRP is called channel routing problem
(CRP). We shall also use the expression open box routing
problem (OBRP) if there is no Eastern boundary point.

A CRP is called bipartite if every net consists of one
Northern and one Southern boundary point. An SRP is called
4-partite if every net consists of one of each of the four types of
boundary points. Finally, we call an OBRP 3-partite if
every net consists of one Northern, one Southern and one
Western boundary point (hence no Eastern boundary point is
contained in any net).
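The definitions above translate directly into a small classifier. This is only a sketch; the function names and the representation of a net as a list of (i, j) coordinate pairs are assumptions made for illustration.

```python
def side(point, w, h):
    """Return 'N', 'S', 'E' or 'W' for a boundary point (i, j); corners are rejected."""
    i, j = point
    if 1 <= i <= h and j == w + 1:
        return "N"
    if 1 <= i <= h and j == 0:
        return "S"
    if i == h + 1 and 1 <= j <= w:
        return "E"
    if i == 0 and 1 <= j <= w:
        return "W"
    raise ValueError(f"{point} is not a boundary point")

def classify(nets, w, h):
    """Name the most special problem class that the set of nets belongs to."""
    used = {side(p, w, h) for net in nets for p in net}
    if used <= {"S"}:
        return "single row routing problem"
    if used <= {"S", "N"}:
        return "CRP"
    if "E" not in used:
        return "OBRP"
    return "SRP"

def is_bipartite_crp(nets, w, h):
    """True if every net has exactly one Northern and one Southern point."""
    return all(sorted(side(p, w, h) for p in net) == ["N", "S"] for net in nets)

# Width 4 and height 5, as in the example of Figure 1.
channel = [[(1, 0), (3, 5)], [(2, 0), (4, 5)]]
kind = classify(channel, w=4, h=5)
```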

The solution of a routing problem in the single layer
model (SLM) is the realization of the nets as pairwise
vertex disjoint subgraphs (usually Steiner trees) of the
planar grid graph G so that each subgraph connects the
boundary points of the net. The edge-disjoint single-layer
model (EDM) is defined in the same way except that the
subgraphs must be pairwise edge disjoint only. For example,
Figure 2 shows the solution of an SRP in the SLM, while the
two SRP's of Figure 3 cannot be solved in the SLM, only in the
EDM.

The unconstrained k-layer model (UkM) requires pairwise
vertex disjoint subgraphs of the k-layer grid graph $G_k$.
Edges of these subgraphs joining adjacent points of two
distinct layers are called vias. In case of k=2 vias are
also called via holes but one should not imagine them as
holes if k>3 since situations like that of Figure 4 are also
possible (segmented or stacked vias, see Mueller and
Mlynski, 1988 or Lengauer, 1990, respectively).

The multilayer models may be constrained. If we have
two layers and one of them is restricted to horizontal wire
segments and the other is restricted to vertical ones then
we obtain the Manhattan model (MM). For example, both SRP's
of Figure 3 can be solved in the U2M but only the second one
can be solved in the MM.

Finally let us emphasize that the "corners" of the grid
graph G (or any copy of them in $G_k$) must not be used in the
routing. Similarly, the solution of a single row routing
problem must not use Eastern, Western or Southern boundary
points, that of a CRP must not use Eastern and Western ones,
and that of an OBRP must not use Eastern ones.


II. 


Every single row routing problem can be solved in the
MM, in fact, in linear time. This observation is probably
due to T. Gallai and it has belonged to the engineering folklore
for decades. Similarly, every CRP can be solved in the U2M
(Marek-Sadowska and Kuh, 1983) and even a linear time
algorithm is known (Recski-Strzyzewski, 1990). However,
while Gallai's algorithm realizes the problem with minimum
width, our algorithm does not, and the computational
complexity of deciding whether a CRP can be solved in the
U2M with a given width seems to be open, see Johnson, 1984
and Recski, 1992 as well.

Hence a natural question arises: can we solve every
OBRP or every SRP in the UkM with a sufficiently large k?
The answer is negative, as shown by the SRP of Figure 5,
essentially due to Hambrusch, 1985. If the pairs of
identical numbers are the nets then the congestion of the
dotted line is h+w. In case of $\ell$ layers this clearly means
$\ell w \ge h + w$, leading to the lower bound $h/w + 1$ for the number of
layers. Thus the number of necessary layers can be
arbitrarily high if we allow very thin or very wide
rectangles.
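The congestion argument gives a one-line bound. The sketch below only evaluates the inequality "number of layers times w is at least h + w" stated in the text for the instance of Figure 5; the function name is hypothetical, and the bound says nothing about other instances.

```python
def min_layers_by_congestion(h, w):
    """Smallest integer l with l * w >= h + w (ceiling of h / w, plus one)."""
    return -(-(h + w) // w)  # integer ceiling division

# A box with h = 12 and w = 4 needs at least 12/4 + 1 = 4 layers for that instance.
layers = min_layers_by_congestion(12, 4)
```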

However,  suppose  that  the  quantity 


is fixed (essentially, bounded from above). Let s denote the
number of those sides of the board which contain terminals
at all, i.e. let s=1 for the single row routing problem, and
s=2, 3 and 4 for the CRP, OBRP and SRP, respectively.

Theorem 1  There is a function $\ell_0 = \ell_0(m,s)$ such that any
problem characterized by m and s can be solved in linear
time in the $U\ell M$ for $\ell \ge \ell_0$.

In particular, we conjecture $\ell_0(1,s) = s$ if $s > 1$. Right
now we can prove the following very special case:

Theorem 2  $\ell_0(1,s) = s$ in the s-partite case ($s > 1$).

Details of the proofs and algorithms will be published
in the full paper. Routing examples are shown in Figures 6
and 7 (m=1, s=2 and m=1, s=4, respectively). Wires in the
four layers (in this order) are shown by heavy, thin, broken
and dotted lines, respectively.


III. 

Our main conjecture, $\ell_0(1,4) = 4$, seems to be unrelated to
the result of Brady and Brown (1984), apart from its title
"Four layers suffice", because there the authors show that a
realization of the SRP in the EDM can always be transformed
to the U4M, but such an EDM solution need not exist. Since
deciding whether such a transformation from EDM to U3M is
possible is known to be NP-complete (Lipski, 1984), it is
reasonable to conjecture that the problem of deciding whether an
SRP with h=w can be realized in the U3M is also
NP-complete.

Abstract

We present a new approach to evaluating lower bounds for a class of quadratic
assignment problems (QAP). An instance of a QAP of size n is specified by two n x n
matrices D and F, and we denote such an instance by QAP(D, F). Our approach is
applicable to problems where the matrix D is derived as rectilinear distances between
points on a regular grid. We construct two matrices $F_{\mathrm{opt}}$ and $F_{\mathrm{res}}$ such that
$F = F_{\mathrm{opt}} + F_{\mathrm{res}}$ and the optimal solution to QAP(D, $F_{\mathrm{opt}}$) is known. Any existing lower
bound can then be applied to QAP(D, $F_{\mathrm{res}}$), which in sum with the optimal value for
QAP(D, $F_{\mathrm{opt}}$) provides a valid lower bound to QAP(D, F). This approach results in
improved lower bounds for some QAPs from the literature.

1. Introduction  A quadratic assignment problem (QAP) of size n is specified by
two n x n matrices D and F. Denoting by $\Pi$ the set of all permutations of {1,2,...,n},
the problem can be defined as $\min_{\pi \in \Pi} \phi(\pi) = \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij}\, d_{\pi(i)\pi(j)}$. QAPs have numerous
applications including facility location, backboard wiring, and scheduling. For a
comprehensive survey of QAPs the reader is referred to a paper by Finke et al. [1]. In the context
of facility location, the matrix D is thought of as the matrix of distances between locations,
and the matrix F is thought of as the matrix of flow or interaction between facilities.
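As a concrete illustration, the objective can be evaluated and minimized by enumeration for a tiny instance. This is a sketch only: the convention that facility i is placed at location perm[i] and the three-location data are assumptions made for the example, and enumeration is feasible only for very small n.

```python
from itertools import permutations

def qap_value(D, F, perm):
    """Sum of flow(i, j) * distance(perm[i], perm[j]) over all ordered pairs."""
    n = len(D)
    return sum(F[i][j] * D[perm[i]][perm[j]] for i in range(n) for j in range(n))

def qap_brute_force(D, F):
    """Return (optimal value, one optimal permutation) by full enumeration."""
    return min((qap_value(D, F, p), p) for p in permutations(range(len(D))))

# Hypothetical data: three locations on a line, symmetric flows.
D = [[0, 1, 2],
     [1, 0, 1],
     [2, 1, 0]]
F = [[0, 5, 1],
     [5, 0, 2],
     [1, 2, 0]]
best_value, best_perm = qap_brute_force(D, F)
```

In this instance the two facilities with the smallest mutual flow end up at the two locations that are farthest apart.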

Two of the main existing lower bounds for the QAP are Gilmore-Lawler bounds (GLB) [2, 4]
and eigenvalue bounds [3, 8]. We propose a novel approach to the problem of computing
lower bounds for QAPs. Our approach is applicable to QAPs whose matrix D is composed
of rectilinear distances between points on a regular grid. All of Nugent's problems [6] of
sizes between 5 and 30, one problem of size 36 due to Steinberg [11] and problems of sizes
between 42 and 100 due to Skorin-Kapov [9, 10] fall under this category.

Our approach for calculating lower bounds for the QAP starts with an initial identification
of an initial optimal part and an initial residual part of F such that F is the sum of the two
parts and the optimal solution to the QAP defined by D and the initial optimal part is known.
Transformations preserving the optimal permutation are then applied to the initial optimal part
to obtain $F_{\mathrm{opt}}$ and $F_{\mathrm{res}}$ such that $F = F_{\mathrm{opt}} + F_{\mathrm{res}}$. One of the transformations we use is similar to the
one proposed by Palubetskes [7] to generate QAPs with rectilinear distance matrix and
known optimal solution. For a QAP specified by matrices A and B, denote by opt(A,B)
(resp., by lb(A,B)) the minimal objective function value (resp., a lower bound on the
optimal objective function value). Clearly then,
$\mathrm{opt}(D,F) \ge \mathrm{opt}(D,F_{\mathrm{opt}}) + \mathrm{opt}(D,F_{\mathrm{res}}) \ge \mathrm{opt}(D,F_{\mathrm{opt}}) + \mathrm{lb}(D,F_{\mathrm{res}})$,
and the last expression is a valid lower bound for the initial
QAP. Any of the existing bounds from the literature can be applied to obtain $\mathrm{lb}(D,F_{\mathrm{res}})$,
and therefore our method could serve as a preprocessing step to possibly tighten existing
bounds. We formulate the construction of $F_{\mathrm{opt}}$ and $F_{\mathrm{res}}$ as a linear programming problem
which we refer to in the sequel as LPLB. The sequel also states our results without proofs
for the sake of brevity.

Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, Stony
Brook, NY 11794. email: chakrapa@ams.sunysb.edu

Harriman School for Management and Policy, State University of New York at Stony Brook, Stony
Brook, NY 11794. email: jskorin@ccvm.sunysb.edu. All correspondence to be addressed to this author.
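The chain of inequalities can be checked numerically on a toy instance. The sketch below is an illustration under stated assumptions: the 2 x 2 grid, the flow matrix, and the constant used for the optimal part are made up, and the exact optimum of the residual problem stands in for a lower bound from the literature (it is trivially a valid one).

```python
from itertools import permutations

def qap_opt(D, F):
    """Exact optimum by enumeration (tiny instances only)."""
    n = len(D)
    return min(
        sum(F[i][j] * D[p[i]][p[j]] for i in range(n) for j in range(n))
        for p in permutations(range(n))
    )

# Rectilinear distances on a 2 x 2 grid, locations numbered row by row.
D = [[0, 1, 1, 2],
     [1, 0, 2, 1],
     [1, 2, 0, 1],
     [2, 1, 1, 0]]
# Hypothetical symmetric flow matrix.
F = [[0, 3, 0, 2],
     [3, 0, 0, 1],
     [0, 0, 0, 4],
     [2, 1, 4, 0]]

xi = 1  # constant off-diagonal entry of the optimal part (an assumption)
n = len(F)
F_opt = [[0 if i == j else xi for j in range(n)] for i in range(n)]
F_res = [[F[i][j] - F_opt[i][j] for j in range(n)] for i in range(n)]

# opt(D, F) >= opt(D, F_opt) + lb(D, F_res), with lb taken as the exact optimum here.
lower_bound = qap_opt(D, F_opt) + qap_opt(D, F_res)
exact = qap_opt(D, F)
```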

2. Constraints of LPLB  The first step to our bounding method is to generate
a QAP instance with known optimal permutation. For convenience we use the identity
permutation $\pi_I$ as the optimal permutation.

QAP with $\pi_I$ as optimal permutation: Consider a QAP where all the entries of one
of the matrices, say F, equal a constant $\xi$. Denoting by $d_{\mathrm{sum}} = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij}$ the sum
of all entries of the matrix D, it can be easily shown that for such a class of QAPs every
permutation (including $\pi_I$) is optimal, and the objective function always evaluates to $\xi\, d_{\mathrm{sum}}$.
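This observation is easy to confirm by enumeration. In the sketch below the 2 x 3 grid, the value of the constant, and the helper names are assumptions made for the example.

```python
from itertools import permutations

def grid_distances(rows, cols):
    """Rectilinear distance matrix for grid points numbered row by row."""
    pts = [(r, c) for r in range(rows) for c in range(cols)]
    return [[abs(a[0] - b[0]) + abs(a[1] - b[1]) for b in pts] for a in pts]

def qap_value(D, F, perm):
    n = len(D)
    return sum(F[i][j] * D[perm[i]][perm[j]] for i in range(n) for j in range(n))

D = grid_distances(2, 3)
n = len(D)
d_sum = sum(map(sum, D))
xi = 2
F_const = [[xi] * n for _ in range(n)]  # every entry equals the constant

# Collect the objective values of all n! permutations; only one value should occur.
values = {qap_value(D, F_const, p) for p in permutations(range(n))}
```

Because every permutation merely reorders the entries of D, the set `values` contains the single number `xi * d_sum`.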
Transformations preserving optimal permutation: Let $\pi_I$ be the optimal permutation
to a QAP. We present two types of transformations that preserve the optimal
permutation when applied to the flow matrix. Both transformations are elementary and the first
is due to Palubetskes [7].

Let the rectilinear distance matrix D be formed from an r x c (n = rc) grid of points. Each
location is then specified by its coordinates in the grid. Let $i = (r_i, c_i)$ and $j = (r_j, c_j)$ be
two locations, $1 \le r_i, r_j \le r$, $1 \le c_i, c_j \le c$. Let $k = (r_k, c_k)$ be another location. We say
that k is in the path between i and j if $r_k$ lies between $r_i$ and $r_j$, and $c_k$ lies between $c_i$
and $c_j$. Note that if k is in the path between i and j, $d_{ij} = d_{ik} + d_{kj}$. The main result of
Palubetskes [7] uses this property of rectilinear distances along with the validity of the triangle
inequality.

Definition 1  Let $\alpha$ be a positive scalar. For all $m \ge 1$, $A^1(i k_1 \ldots k_m j, \alpha)$ is an n x n matrix
such that the entries $i k_1, k_1 k_2, k_2 k_3, \ldots, k_m j$ equal $\alpha$, the entry $ij$ equals $-\alpha$, and the rest
of the entries are 0.

Definition 2  Let $\alpha$ be a positive scalar. $A^2(ikj, \alpha)$ is an n x n matrix such that the entries
$ik$, $kj$ and $ij$ equal $\alpha$, and the rest of the entries are 0.

Lemma 1 (Palubetskes)  Let D be the rectilinear distance matrix and let $\mathcal{F}$ be the flow
matrix for which $\pi_I$ is optimal. Let $i, k_1, k_2, \ldots, k_m, j$ be m + 2 locations ($m \ge 1$) such
that $k_1$ is in the path between i and j, and for all $2 \le l \le m$, $k_l$ is in the path between $k_{l-1}$
and j. Define $\mathcal{F}' = \mathcal{F} + A^1(i k_1 \ldots k_m j, \alpha)$. Then $\pi_I$ is still optimal for $\mathcal{F}'$ with
the optimal objective function value unchanged.

Lemma 2  Let D be the rectilinear distance matrix and let $\mathcal{F}$ be the flow matrix for which
$\pi_I$ is optimal. Let i, k, j be three locations such that $d_{ik} = d_{kj} = 1$ ($d_{ij} = 2$). Define
$\mathcal{F}' = \mathcal{F} + A^2(ikj, \alpha)$. Then $\pi_I$ is still optimal for $\mathcal{F}'$ with an optimal objective function
value of $\mathrm{opt}(D,\mathcal{F}) + 4\alpha$.

Formally we define the transformations as below.

Definition 3  $T1(i k_1 \ldots k_m j, \alpha)$ is the transformation due to the addition of $A^1(i k_1 \ldots k_m j, \alpha)$
to the flow matrix, where the locations $i, k_1, \ldots, k_m, j$ satisfy the path criterion of Lemma 1.

Definition 4  $T2(ikj, \alpha)$ is the transformation due to the addition of $A^2(ikj, \alpha)$ to the flow
matrix, where the locations i, k, j are such that $d_{ik} = d_{kj} = 1$.

Since the rectilinear distance matrix is symmetric, the flow matrix can be assumed to be
symmetric without loss of generality. The resultant flow matrix after either of the
transformations can be kept symmetric by performing both $T1(i k_1 \ldots k_m j, \alpha)$ and $T1(j k_m \ldots k_1 i, \alpha)$,
or $T2(ikj, \alpha)$ and $T2(jki, \alpha)$. In the first case the optimal objective value is unchanged, and
in the second case it is $\mathrm{opt}(D,\mathcal{F}) + 8\alpha$.
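Both transformations can be checked by enumeration on a small grid. The sketch below is an illustration, not the authors' implementation: the 2 x 3 grid, the constant starting flow matrix (for which every permutation, including the identity, is optimal), the chosen locations, and all function names are assumptions.

```python
from itertools import permutations

ROWS, COLS = 2, 3
POINTS = [(r, c) for r in range(ROWS) for c in range(COLS)]  # numbered row by row
N = len(POINTS)
D = [[abs(a[0] - b[0]) + abs(a[1] - b[1]) for b in POINTS] for a in POINTS]

def in_path(k, i, j):
    """True if location k lies in the path between locations i and j."""
    (rk, ck), (ri, ci), (rj, cj) = POINTS[k], POINTS[i], POINTS[j]
    return min(ri, rj) <= rk <= max(ri, rj) and min(ci, cj) <= ck <= max(ci, cj)

def value(F, p):
    return sum(F[a][b] * D[p[a]][p[b]] for a in range(N) for b in range(N))

def opt(F):
    return min(value(F, p) for p in permutations(range(N)))

def apply_t1(F, chain, alpha):
    """chain = [i, k1, ..., km, j]: add alpha along the chain, subtract it at (i, j)."""
    G = [row[:] for row in F]
    for a, b in zip(chain, chain[1:]):
        G[a][b] += alpha
    G[chain[0]][chain[-1]] -= alpha
    return G

def apply_t2(F, i, k, j, alpha):
    """Add alpha to the entries ik, kj and ij."""
    G = [row[:] for row in F]
    for a, b in ((i, k), (k, j), (i, j)):
        G[a][b] += alpha
    return G

identity = tuple(range(N))
F0 = [[0 if a == b else 1 for b in range(N)] for a in range(N)]
base = opt(F0)
F1 = apply_t1(F0, [0, 1, 5], alpha=3)    # location 1 lies in the path between 0 and 5
F2 = apply_t2(F0, 0, 1, 2, alpha=3)      # d(0,1) = d(1,2) = 1 and d(0,2) = 2
```

After the first transformation the identity permutation still attains the unchanged optimum; after the second it attains the optimum increased by four times alpha.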

Constraints: We first present constraints for the first type of transformations.
Formulation of constraints involves identifying for each location i the sets
$M^i = \{ j \mid j > i$ and j is involved in $T1(ik \ldots j)$ for some $k \}$ and
$A^i = \{ k \mid r_k > r_i$ and k is involved in $T1(ik \ldots j)$ for some $j \}$
such that $M^i \cap A^i = \emptyset$. Then for each location i, a single source single sink
network flow graph is developed as follows. Node i is the source and each element in the
set $M^i$ has a directed edge to a common sink $z^i$. There are directed edges from i to each
element k in the set $A^i$. For each $k \in A^i$ directed edges are added from k to each element
in the set $A^k$ as follows. If $c_k < c_i$, add directed edges from k to elements in $A^k$ whose
columns are less than or equal to $c_k$; if $c_k > c_i$, add directed edges to elements in $A^k$ whose
columns are greater than or equal to $c_k$; and if $c_k = c_i$, add edges to all elements in $A^k$.
This process is continued recursively for each element $k \in A^i$ and terminates upon reaching
either the bottom left node (r, 0), or the bottom right node (r, c), or both, depending on
whether $c_k$ is less than, greater than, or equal to $c_i$, respectively. Similarly, graphs are
constructed for each location i. The result is a set of n - 1 graphs, one for each location (except
the n-th) as the source, and there is no edge between graphs corresponding to different
locations. Consider the intermediate nodes which are neither the source nor the sink. For
these nodes the balance constraints (flow-in equals flow-out) form a set of linear constraints.

A flow from a source to a sink corresponds to a transformation as follows.

Definition 5  Let i be the source and let $j \in M^i$ be one of the nodes with an edge to the
sink $z^i$. Consider a positive flow of $\alpha$ along the path $i \to k_1 \to \ldots \to k_m \to j \to z^i$. The
transformation corresponding to the flow is $T1(i k_1 \ldots k_m j, \alpha)$.

We establish that the performance of the transformations must satisfy the constraints.

Theorem 1  Let $S_1$ be the set of all feasible positive flows satisfying balance constraints,
and let $T^1$ be the set of all possible transformations of the first type with the sets $A^i$ and
$M^i$ defined as above. Every $s_1 \in S_1$ corresponds to some subset of $T^1$ and every subset of
$T^1$ corresponds to some $s_1 \in S_1$.

A set of constraints for transformations of the second type can be realized in a similar
fashion.

3. Objective Function of LPLB  We design (heuristically) a linear objective
function to possibly tighten the Gilmore-Lawler bounds (GLB). First, the rows and
columns of the matrix F are permuted so that $\pi_I$ achieves the best known objective
function value for the QAP. Intuitively, this strategy should bring $F_{\mathrm{opt}}$ "closer" to F, providing
better bounds.

We start with an initial optimal part in which all entries equal a constant $\xi > 0$, except
the diagonal entries, which are zero. The initial residual part is defined so that F is the sum
of the initial optimal part and the initial residual part.

For the first type of transformations, we set $M^i = \{ j \mid j > i$ and the entry ij of the initial
residual part is $< 0 \}$ and $A^i = \{ k \mid r_k > r_i$ and the entry ik of the initial residual part is
$> 0 \}$. In other words, the transformation T1 adds to (subtracts from) an entry of the optimal
part if the corresponding entry in the residual part is greater (less) than zero.
Similarly, the sets are defined for T2. Note that T2 only adds to entries of the optimal part.
Denote by $X_{ij}$ the total amount added to or subtracted from the entry ij of the initial optimal
part due to all the transformations.

We  impose  some  additional  constraints  on  Xij  as  follows.  Xij  <  if  j  ^  M',  and 

Xij  <  otherwise.  These  constraints  ensure  that  the  transformations  do  not  produce 

additional  negative  entries  in  the  residual  part  of  the  flow  matrix  F. 

Recall that $\pi_I$ (the optimal permutation for $F_{\mathrm{opt}}$) also achieves the best known objective
function value for F. If $\pi_I$ can also be established as an optimal permutation for $F_{\mathrm{res}}$, its
optimality for F is proven. Though this may not be possible in all cases, better bounds
may be obtained in general if $\mathrm{lb}(D,F_{\mathrm{res}})$ is close to the value it achieves with $\pi_I$. Let
$d_{\max} = r + c - 2$ denote the maximum entry in the distance matrix, and let $d_{\min} = 1$ denote
the minimum entry. If $\pi_I$ were optimal for $F_{\mathrm{res}}$, in the evaluation of the objective function
each entry $(F_{\mathrm{res}})_{ij}$ would be multiplied by $d_{ij}$. For the unknown optimal permutation let it
be multiplied by some other distance matrix entry $d_{pq}$. The difference in objective function
value due to a single entry ij is $(F_{\mathrm{res}})_{ij}(d_{ij} - d_{pq})$. If $(F_{\mathrm{res}})_{ij} > 0$, $(d_{ij} - d_{\min})X_{ij}$ is the
maximum gain in lower bound due to the entry ij. Similarly, if $(F_{\mathrm{res}})_{ij} < 0$, $(d_{\max} - d_{ij})X_{ij}$ is
the maximum gain due to entry ij. Our objective function is to maximize $\sum_{i,j=1}^{n} c_{ij} X_{ij}$
with the coefficients $c_{ij}$ being either $d_{ij} - d_{\min}$ or $d_{\max} - d_{ij}$, depending on whether $(F_{\mathrm{res}})_{ij}$
is greater than or less than zero.
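The coefficient rule can be written down in a few lines. This is a sketch under assumptions: the residual matrix below is invented, the function name is hypothetical, and a coefficient of 0 for residual entries that are exactly zero is a choice made here because the text does not cover that case.

```python
def objective_coefficients(D, F_res):
    """c_ij = d_ij - d_min for positive residual entries, d_max - d_ij for negative ones."""
    n = len(D)
    d_min = 1                                   # smallest off-diagonal grid distance
    d_max = max(max(row) for row in D)          # equals r + c - 2 on an r x c grid
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or F_res[i][j] == 0:
                continue                        # assumption: no gain attributed here
            if F_res[i][j] > 0:
                C[i][j] = D[i][j] - d_min
            else:
                C[i][j] = d_max - D[i][j]
    return C

# Rectilinear distances on a 2 x 2 grid and a made-up residual part.
D = [[0, 1, 1, 2],
     [1, 0, 2, 1],
     [1, 2, 0, 1],
     [2, 1, 1, 0]]
F_res = [[0, 2, -1, 3],
         [2, 0, 0, -4],
         [-1, 0, 0, 1],
         [3, -4, 1, 0]]
C = objective_coefficients(D, F_res)
```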

4. Computational Results  Computation of a lower bound involves three
phases. In the first phase, an LP is generated depending on the initial optimal part, and in
the second phase, the LP is solved. In the third phase, the optimal part is constructed from
the LP solution, its objective function evaluated, and a lower bound from the literature is
applied to the residual part. The first and third phases of the computation were done on a
Sun SPARC Station 1, and for the second phase the IBM 3090 version of LINDO was used.
Recall that prior to generating the LP, the rows and columns of F are permuted (based on
the best known heuristic solution) so that $\pi_I$ is now the best known solution.

The constructive bounding method was tested on a number of problems from the literature,
viz., Nugent's problems [6] of sizes 5-30, one problem of size 36 due to Steinberg [11] and
problems of size 42 and 49 due to Skorin-Kapov [10]. The initial optimal part was
constructed by choosing constant entries $\xi$ such that $\xi = \mathrm{bkv}(D,F)/d_{\mathrm{sum}}$, where bkv(D,F)
denotes the best known value for the QAP. Note that for this optimal part the optimal
objective value equals the best known value. The LP corresponding to this optimal part is
generated and solved using LINDO. GLB is then applied to $F_{\mathrm{res}}$. We present the results in
Table 1. In the table BKV refers to the best known value for the QAP (which is optimal
for problems up to Nug15), and CGLB refers to the lower bound for QAP(D,F) obtained
by using GLB to obtain the lower bound for QAP(D, $F_{\mathrm{res}}$).

Among the existing lower bounds in the literature, GLB provides the best bounds for
Nugent's problems of size up to 8. However, for larger problems GLB does not perform as well
as eigenvalue based bounds. We consider two eigenvalue based bounds from the literature:
MEVB developed by Rendl and Wolkowicz [8], and IVB developed by Hadley et al. [3].
MEVB provides better results than IVB for Nugent's problems of size up to 30. However,
we do not have results from MEVB on problems of size greater than 30.

For problems of size greater than 15, we also performed another set of experiments by
varying the starting entry for the optimal part. Recall from section 4 that if the starting
entry $\xi$ is 0, $M^i = \emptyset$ and no transformations of the first type are possible. As $\xi$ increases,
the sets $M^i$ grow in size and the sets $A^i$ shrink until $A^i = \emptyset$ when $\xi$ equals the maximum
element in the matrix F. We tried a few values of $\xi$ using IVB to compute the bounds for the
residual part. The results are presented in Table 2.

From the tables it can be seen that when our constructive method is used as a preprocessing
step, the bounds obtained (CGLB) are better than GLB for all the problems tested.
Even for larger problems, where the eigenvalue bounds seem to provide better results,
construction improves the bounds. Table 2 shows in bold the best bounds (CIVB) obtained
by constructing $F_{\mathrm{opt}}$ and evaluating IVB(D, $F_{\mathrm{res}}$). CIVB obtains better results than both
MEVB, where applicable, and IVB for all the test problems.

Eigenvalue bounds seem to improve if the spectral radii of the matrices D and F are
small [8]. Though there is no closed form equation to evaluate the spectral radius sp(A)
of a matrix A, it obeys the following inequality due to Mirsky [5]:
$\mathrm{sp}(A) \le m(A) = \left[ 2 \sum_{i} \sum_{j} a_{ij}^2 - \frac{2}{n} (\mathrm{tr} A)^2 \right]^{1/2}$,
where tr A denotes the trace of the matrix A. Performing transformations
to minimize $m(F_{\mathrm{res}})$ can be posed directly as a quadratic programming problem
with a convex objective function. Table 2 provides results with a linear objective function,
and we suspect that a quadratic objective might improve results even further.

Table 1: Constructive GLB


Problem     BKV   MEVB(D,F)   GLB(D,F)   OPT(D,Fopt)   GLB(D,Fres)   CGLB(D,F)
Nug6         86          70         82        92.161                        84
Nug8        214         174        186       248.115       -42.112         206
Nug12       578         495        493       618.792       -92.682         528
Nug15      1150         989        963      1191.525      -149.254        1044
Nug20      2570        2229       2057      2687.272      -466.954        2222
Nug30      6124        5349       4539      6194.461      -974.015        5222
Ste36      9526          NA       7124     10795.224     -3316.368        7480
Sko42     15812          NA      11311     15836.092     -2698.571       13138

Table 2: Constructive IVB

Problem   ξ   MEVB(D,F)   IVB(D,F)   OPT(D,Fopt)   IVB(D,Fres)   CIVB(D,F)
Nug20     3        2229       2196          3480     -1275.052        2206
          4                                 4562     -2314.134        2248
          5                                 5700     -3457.714        2244
Nug30              5349       5265          9595     -4244.326        5352
                                           12778     -7402.730        5376
                                           15998    -10631.381        5368
Sko42     5          NA      13830         37350    -23241.681
          6                                44772    -30654.461       14118
          7                                52234    -38152.217
Sko49     6          NA      20716         65858    -45018.698       20840
          7                                76838    -55974.705
          8                                         -67007.763

5. Conclusions. We have proposed a new construction-based approach to obtaining
lower bounds for the QAP. Our approach is based on performing
optimality-preserving transformations to decompose the QAP into two problems:
one for which an optimal solution is known, and another to which any existing
lower bound can be applied. This provides a lower bound to the original QAP.
Among existing lower bounds we considered GLB and IVB in our study.
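
The way the two parts combine can be illustrated with a small arithmetic sketch. It assumes, as the decomposition suggests, that the objective splits additively over the two flow matrices, so that the known optimum of the first subproblem plus any valid lower bound for the residual subproblem cannot exceed the optimum of the original problem. The function name `combined_bound` is hypothetical; the numbers are copied from the Ste36 and Sko42 rows of Table 1.

```python
# Rows copied from Table 1:
# (problem, OPT(D,F_opt), GLB(D,F_res), reported CGLB(D,F)).
rows = [
    ("Ste36", 10795.224, -3316.368, 7480),
    ("Sko42", 15836.092, -2698.571, 13138),
]


def combined_bound(opt_value, residual_bound):
    """Known optimum of the first subproblem plus a lower bound for the
    residual subproblem (hypothetical helper, for illustration only)."""
    return opt_value + residual_bound


for name, opt_value, residual_bound, reported in rows:
    bound = combined_bound(opt_value, residual_bound)
    print(f"{name}: {opt_value} + ({residual_bound}) = {bound:.3f}; "
          f"reported CGLB = {reported}")
```

In both rows the reported CGLB lies within 2 of the plain sum, which is consistent with the sum presumably being rounded up to an integer bound before it was tabulated.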

We  provide  a  set  of  linear  constraints  to  perform  the  transformations.  A  linear  programming 
problem  LPLD  is  formulated  and  solved  to  complete  the  construction.  We  have  improved 
both GLB and IVB for all the problems tested. We conjecture that IVB may be
improved directly by formulating a quadratic objective function, and solving
the resulting optimization problem.

Though our method is developed for QAPs with rectilinear distance matrix, it
can be extended to QAPs with distance matrices satisfying the triangle
inequality. Since our method constructs F_opt with known optimal solution such
that F = F_opt + F_res, it has applications to sensitivity analysis and may
also be useful in branch and bound techniques.

The major difficulty in mathematical programming is no longer the solution of
large models; it is the correct formulation (or reformulation) of the model.
Large models are now solved routinely, but their very size complicates the
determination of how to make repairs when the model is infeasible or otherwise
nonfunctional. One useful approach is to localize or isolate the problem to a
smaller portion of the whole model. This paper presents methods and case
studies in the analysis of infeasible mathematical programs by isolating an
Irreducibly Inconsistent Set (IIS) of constraints.

An  IIS  is  a  set  of  constraints  which  is  infeasible,  but  which 
becomes  feasible  if  any  one  member  is  removed.  The  IIS  may  consist 
of  only  a  few  constraints  when  the  total  constraint  set  is  very 
large.  The  diagnosis  of  the  problem  in  human-understandable  terms 
often follows directly from examination of the IIS. At worst, other algorithms
or expert systems or humans need only operate on the IIS, typically a much
reduced portion of the entire model, to arrive at a final diagnosis. This
improves the overall efficiency of the diagnosis and repair process.
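
The definition of an IIS can be made concrete with a small sketch. It uses a deletion filter, one well-known way of isolating an IIS; the abstract does not say which algorithms MINOS(IIS) implements, so this is an illustration of the concept only. To keep the example self-contained, the constraints are simple bounds lo <= x <= hi on a single variable, and the function names are hypothetical.

```python
def feasible(intervals):
    """Bounds lo <= x <= hi on one variable are jointly feasible exactly when
    the largest lower bound does not exceed the smallest upper bound."""
    if not intervals:
        return True
    return max(lo for lo, _ in intervals) <= min(hi for _, hi in intervals)


def deletion_filter(constraints):
    """Return the indices of an irreducibly inconsistent subset.

    Each constraint is dropped in turn; if the rest is still infeasible the
    constraint is not needed to explain the infeasibility and stays dropped.
    """
    assert not feasible(constraints), "the full set must be infeasible"
    kept = list(range(len(constraints)))
    for idx in range(len(constraints)):
        trial = [i for i in kept if i != idx]
        if not feasible([constraints[i] for i in trial]):
            kept = trial  # still infeasible without idx: drop it for good
    return kept


constraints = [(0, 10), (5, 8), (9, 12), (-3, 20)]
print(deletion_filter(constraints))  # indices of the conflicting bounds
```

Here only the second and third bounds conflict (x <= 8 and x >= 9), so the filter isolates two constraints out of four; removing either of them restores feasibility.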

The  paper  presents  the  basic  algorithms  for  IIS  isolation  for 
both  linear  and  nonlinear  programs,  and  their  implementation  in  a 
modified  version  of  MINOS  5.3,  known  as  MINOS(IIS),  developed  at 
Carleton  University.  The  algorithms  are  effective  and  quick  in  the 
linear  case.  The  time  to  find  the  IIS  is  often  a  small  fraction  of 
the time to make the initial determination that the LP is infeasible.

A  specialized  procedure  for  networks,  which  incorporates  the 
concept  of  nonviability  analysis,  is  also  presented.  Nonviability 
is  a  structural  property  of  a  network  which  a  priori  forces  some  of 
the  arc  flows  to  zero,  before  the  addition  of  flow  bounds  or  extra 
side  constraints.  An  ordered  set  of  tests  of  an  infeasible 
network,  including  nonviability  and  IIS  analysis,  provides  an 
improved diagnosis. Unlike flow-balancing methods, the specialized procedure
is applicable to advanced netforms such as processing networks.

The  analysis  of  infeasible  nonlinear  programs  is  complicated 
by  the  inability  of  nonlinear  optimizers  to  determine  the 
feasibility  of  a  nonlinear  constraint  set  with  100%  accuracy. 
However,  useful  information  can  still  be  extracted  which  can  help 
in  selecting  a  new  initial  point  for  the  optimizer. 

Case  studies  of  analyses  of  infeasible  linear  programs, 
networks,  and  nonlinear  programs  are  presented. 

1. INTRODUCTION

In this paper, we investigate the facial structure of the polytope whose
extreme points are exactly the m×p 0-1 block diagonal matrices (m, p ∈ N).
More precisely, we define a matrix X to be block diagonal if there exists a
partition R_1, ..., R_k, R_{k+1} of its row-set and a partition
C_1, ..., C_k, C_{k+1} of its column-set such that x_ij ≠ 0 if and only if
i ∈ R_l and j ∈ C_l for some 1 ≤ l ≤ k (notice that what we really mean is
that X is block diagonal up to permutations of its rows and columns). We let

S_{m,p} = { X ∈ {0,1}^{m×p} | X is block diagonal },

and we denote by Q_{m,p} the convex hull of S_{m,p}. The goal of this paper is
to provide a (partial) description of the polytope Q_{m,p} by linear
inequalities.

As explained in Crama and Oosten (1992), our interest in the polytope Q_{m,p}
mainly stems from its relation to the cell formation problem encountered in
cellular manufacturing. The data for this problem are generally assumed to be
summarized in the machine-part incidence matrix A, where a_ij = 1 if part j
needs to be processed on machine i, and a_ij = 0 otherwise. Recall that a
group technology cell consists of a number of machines (a machine-group)
geared to the manufacturing of a number of similar parts (a part-family). The
cell formation problem asks for a partition of the machines into
machine-groups, a partition of the parts into part-families, and a matching
between the machine-groups and the part-families which optimizes some measure
of the inter- and intra-cell relationships. It can be abstracted into the
following block diagonalization problem: given an m×p incidence matrix A and a
function f(.,.), find an m×p block diagonal incidence matrix X which minimizes
f(A,X) (the function f(.,.) gives an estimate of the distance, or
dissimilarity, between the original incidence matrix A and the 'ideal'
cellularized system represented by X). In Crama and Oosten (1992), we showed
that, for many of the objective functions f(.,.) proposed in the literature,
the cell formation problem can be reduced to the problem of minimizing a
linear function of the variables x_ij (i = 1, ..., m; j = 1, ..., p) over the
polytope Q_{m,p}. Similar block diagonalization problems also arise in the
analysis of large data arrays (e.g. for marketing or archeology applications),
in production planning for flexible manufacturing systems, in sparse matrix
computations, etc. (see Crama and Oosten 1992 for references).

In our presentation, we will often rely on a graph-theoretic interpretation of
block diagonal matrices and of the polytope Q_{m,p}. We follow the
graph-theoretic terminology of Bondy and Murty (1976). Moreover, when
B = (U,V,E) is a bipartite graph and G = (U,V,F) is a subgraph of B, we say
that G is a complete bipartite partitioning of B if all connected components
of G are complete bipartite (we look at isolated vertices as complete
bipartite graphs). In particular, consider the complete bipartite graph
K_{m,p} = (U_m, V_p, U_m×V_p), where U_m = {u_1,...,u_m} and
V_p = {v_1,...,v_p}. We regard an arbitrary m×p 0-1 matrix X as the adjacency
matrix of a subgraph G of K_{m,p}, say G = (U_m,V_p,F), where (u_i,v_j) ∈ F if
and only if x_ij = 1. It is easy to see that the matrix X is block diagonal if
and only if its associated graph is a complete bipartite partitioning of
K_{m,p}.
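
The graph-theoretic characterization translates directly into a test for block diagonality. The following is a minimal sketch, assuming the matrix is given as a list of 0-1 rows; the function names are hypothetical. It collects the connected components of the associated bipartite graph and checks that each one is complete bipartite.

```python
def components(X):
    """Connected components of the bipartite graph whose edges are the
    nonzero entries of X, each returned as (row_set, column_set).
    Isolated vertices (all-zero rows or columns) are skipped."""
    m, p = len(X), len(X[0])
    seen_rows = set()
    result = []
    for start in range(m):
        if start in seen_rows or not any(X[start]):
            continue
        rows, cols = {start}, set()
        stack = [("row", start)]
        while stack:
            side, v = stack.pop()
            if side == "row":
                for j in range(p):
                    if X[v][j] and j not in cols:
                        cols.add(j)
                        stack.append(("col", j))
            else:
                for i in range(m):
                    if X[i][v] and i not in rows:
                        rows.add(i)
                        stack.append(("row", i))
        seen_rows |= rows
        result.append((rows, cols))
    return result


def is_block_diagonal(X):
    """True if every component is complete bipartite, i.e. X is block
    diagonal up to permutations of its rows and columns."""
    return all(X[i][j] for rows, cols in components(X)
               for i in rows for j in cols)


print(is_block_diagonal([[1, 0, 1], [0, 1, 0], [1, 0, 1]]))  # blocks after permutation
print(is_block_diagonal([[1, 1, 0], [0, 1, 1], [0, 0, 0]]))  # a path, not a block
```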

This graph-theoretic interpretation stresses the analogy of the polytope
Q_{m,p} with the clique partitioning polytope P_n investigated by Faigle,
Shrader and Suletzki (1986) and Grötschel and Wakabayashi (1989), and with the
related multiway cut polytope studied by Chopra and Rao (1991). In fact,
Q_{m,p} can be viewed as the projection of P_{m+p} on some appropriate
subspace. But this observation does not seem very useful in deriving a
description of Q_{m,p} from the results available about P_n.

In Section 2, some general properties of facet-defining inequalities for the
polytope Q_{m,p} are stated. In Section 3 specific families of facet-inducing
inequalities are described. Section 4 contains some lifting theorems. Finally,
in Section 5, a technique is presented to patch facet-defining inequalities
into new valid inequalities which, under certain conditions, also define
facets.


2.  PROPERTIES  OF  FACET-DEFINING  INEQUALITIES 

We describe in this section some general properties of facet-defining
inequalities for the polytope Q_{m,p}: two 'lifting' results, relating facets
of lower-dimensional polytopes to facets of higher-dimensional ones, and one
proposition describing the 'graphical' structure of facet-defining
inequalities.

In our discussion, it will often be convenient to consider the polytope
associated with block diagonal submatrices of a given matrix, or equivalently,
with complete bipartite partitionings of a given graph. To define these
concepts more accurately, let B = (U_m, V_p, E(B)) be an arbitrary bipartite
graph, where, as before, U_m = {u_1,...,u_m} and V_p = {v_1,...,v_p}. The set
of incidence matrices of complete bipartite partitionings of B is denoted by
S_B, and the convex hull of S_B is denoted by Q_B. Clearly, if B = K_{m,p},
then S_B = S_{m,p} and Q_B = Q_{m,p}. In fact, the polyhedron Q_B can be
viewed in the space R^{m×p} as the face of Q_{m,p} with the property that, for
all X ∈ Q_B, x_ij = 0 when (u_i,v_j) ∉ E(B).

The dimension of Q_B is |E(B)|, since the subgraph of B containing no edges at
all, as well as any subgraph containing only one edge of B, are complete
bipartite partitionings of B. By the same reasoning, the trivial inequalities
x_e ≥ 0 and x_e ≤ 1 are facet-defining for Q_B, for all e ∈ E(B).

Suppose now that B_1 and B_2 are two bipartite graphs on the same vertex-set,
and differing only in one edge (u_i,v_j), for example
E(B_2) = E(B_1) ∪ {(u_i,v_j)}. Our first result follows directly from the
sequential lifting procedure described in Nemhauser and Wolsey (1988),
combined with the observation that Q_{B_1} is a facet of Q_{B_2}:

Proposition 1. Consider the valid inequality Π X ≤ π_0 and assume that it
defines a facet of Q_{B_1}. Then, the inequality Π X + π_ij x_ij ≤ π_0 defines
a facet of Q_{B_2} iff

π_ij = π_0 − max { Π X | X ∈ S_{B_2} and x_ij = 1 }.
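
For very small graphs the lifting coefficient of Proposition 1 can be computed by plain enumeration. The sketch below is an illustration under stated assumptions rather than a practical method: edges are (i, j) pairs standing for (u_i, v_j), the function names are hypothetical, and the starting inequality x_11 + x_12 + x_21 ≤ 2 on the three-edge path B_1 is simply taken as given.

```python
from itertools import combinations


def is_partitioning(edges):
    """True if every connected component of the edge set is complete
    bipartite. Edges are (i, j) pairs: row vertex u_i, column vertex v_j."""
    edges = set(edges)
    remaining = set(edges)
    while remaining:
        i0, j0 = next(iter(remaining))
        rows, cols = {i0}, {j0}
        grew = True
        while grew:  # grow the component containing (i0, j0)
            grew = False
            for i, j in edges:
                if (i in rows) != (j in cols):
                    rows.add(i)
                    cols.add(j)
                    grew = True
        block = {(i, j) for i in rows for j in cols}
        if not block <= edges:
            return False  # some pair inside the component is not an edge
        remaining -= block
    return True


def lifting_coefficient(pi, pi0, edges_b2, new_edge):
    """pi_ij = pi0 - max{ Pi X : X in S_{B_2}, x_ij = 1 }, by enumeration."""
    others = [e for e in edges_b2 if e != new_edge]
    best = None
    for r in range(len(others) + 1):
        for subset in combinations(others, r):
            chosen = set(subset) | {new_edge}
            if is_partitioning(chosen):
                value = sum(pi.get(e, 0) for e in chosen)
                best = value if best is None else max(best, value)
    return pi0 - best


pi = {(1, 1): 1, (1, 2): 1, (2, 1): 1}          # x_11 + x_12 + x_21 <= 2 on B_1
edges_b2 = [(1, 1), (1, 2), (2, 1), (2, 2)]     # B_2 = B_1 plus the edge (u_2, v_2)
print(lifting_coefficient(pi, 2, edges_b2, (2, 2)))
```

The best solution with x_22 = 1 is the all-ones matrix with value 3, so the lifted coefficient is 2 − 3 = −1 and the lifted inequality is x_11 + x_12 + x_21 − x_22 ≤ 2.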

This proposition guarantees that, when a facet-defining inequality is derived
for a 'partial' polyhedron Q_B, this inequality can always be lifted to a
facet-defining inequality of Q_{m,p}. The following proposition shows that an
inequality defining a facet of Q_{m,p} is also facet-defining for each of the
polyhedra corresponding to block diagonal matrices with at least m rows and p
columns. It is similar in spirit to Theorem 3.3 in Grötschel and Wakabayashi
(1990), Theorem 3.2 in Chopra and Rao (1991) and Theorem 2.2 in Deza and
Laurent (1992).

Proposition 2. Assume that the inequality Π X ≤ π_0 defines a facet of
Q_{m,p}, and let a, b ∈ N with a ≥ m and b ≥ p. Then the inequality Γ Y ≤ π_0
defines a facet of Q_{a,b}, where Γ ∈ R^{a×b}, γ_ij = π_ij for i ≤ m and
j ≤ p, and γ_ij = 0 otherwise.

Consider the inequality Π X ≤ π_0 and assume that it defines a nontrivial
facet of Q_B. Without any information about the numerical values of the
coefficients (Π, π_0), some general structural properties of the inequality
can be stated. To do this, associate with the inequality two edge-sets E and
E^+, defined as follows:

E := { (u_i,v_j) | (u_i,v_j) ∈ B and π_ij ≠ 0 },
E^+ := { (u_i,v_j) | (u_i,v_j) ∈ B and π_ij > 0 }.

We call the graph H := ( V(E), E ) (respectively H^+ := ( V(E^+), E^+ )) the
support (respectively the positive support) of the inequality Π X ≤ π_0.

Proposition 3. If B is a nonempty bipartite graph, and Π X ≤ π_0 induces a
facet of Q_B, then:

(1) π_0 > 0;

(2) E^+ is nonempty;

(3) the support H of Π X ≤ π_0 is connected;

(4) the positive support H^+ of Π X ≤ π_0 is connected;

(5) V(E) = V(E^+).

If moreover B is a complete bipartite graph, then:

(6) E\E^+ is nonempty, i.e. Π has negative elements;

(7) the support H of Π X ≤ π_0 is two-connected.


3.  FACET-DEFINING  INEQUALITIES 

We present in this section various classes of facet-defining inequalities for
Q_{m,p}. These inequalities will be obtained by lifting facet-defining
inequalities for a face Q_B of Q_{m,p} (according to Proposition 1). Some of
the subgraphs B which we will consider are 'squares' (i.e. C4's), so-called
'spiked' C4-free connected bipartite graphs and cycles.

For a given X ∈ {0,1}^{m×p} and a given subgraph B of K_{m,p}, we use the
shorthand x(B) to denote the sum Σ ( x_ij | (u_i,v_j) ∈ E(B) ).

3.1. Square inequalities

Crama and Oosten (1992) observed that the square inequalities:

x_hj + x_hk + x_ij − x_ik ≤ 2   (h, i ∈ {1,...,m}, j, k ∈ {1,...,p})

are valid for S_{m,p} (hence, for its convex hull Q_{m,p}), and that they
yield, together with the integrality constraints on X, a valid description of
S_{m,p}; that is,

S_{m,p} = { X ∈ {0,1}^{m×p} | x_hj + x_hk + x_ij − x_ik ≤ 2
            for all h, i ∈ {1,...,m} and j, k ∈ {1,...,p} }.

In fact, it is easy to see that the square inequalities are facet-defining for
Q_{2,2}, and hence we deduce from Proposition 2 that they are facet-defining
for Q_{m,p}, for all m, p ≥ 2.
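
The claim that the square inequalities, together with integrality, describe exactly the block diagonal matrices can be checked exhaustively for small sizes. The sketch below compares them with an independent criterion that is an assumption of this example, not a statement from the text: a 0-1 matrix is block diagonal up to permutations exactly when any two rows have either identical or disjoint supports. Function names are hypothetical.

```python
from itertools import product


def satisfies_square_inequalities(X):
    """Check x_hj + x_hk + x_ij - x_ik <= 2 for all rows h, i and columns j, k."""
    m, p = len(X), len(X[0])
    return all(
        X[h][j] + X[h][k] + X[i][j] - X[i][k] <= 2
        for h in range(m) for i in range(m)
        for j in range(p) for k in range(p)
    )


def rows_equal_or_disjoint(X):
    """Independent criterion: any two row supports are equal or disjoint."""
    supports = [frozenset(j for j, x in enumerate(row) if x) for row in X]
    return all(s == t or not (s & t) for s in supports for t in supports)


def criteria_agree(m, p):
    """Compare both criteria on every m x p 0-1 matrix."""
    for bits in product((0, 1), repeat=m * p):
        X = [list(bits[r * p:(r + 1) * p]) for r in range(m)]
        if satisfies_square_inequalities(X) != rows_equal_or_disjoint(X):
            return False
    return True


print(criteria_agree(3, 3))  # all 512 matrices of size 3 x 3
```

A violated square inequality means that two rows share a column j while one of them has a further column k that the other lacks, which is precisely the situation the row-support criterion forbids.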

3.2.  Facet-defining  inequalities  based  on  spiked  C4-free  connected  bipartite  graphs 

The term 'spike' refers to special edges of the positive support H^+ of a
valid inequality, say Π X ≤ π_0, or to the corresponding coefficients of Π. A
spike-leaf of H^+, or of Π X ≤ π_0, is a vertex covered by exactly one edge of
the subgraph H^+. That covering edge is called a spike. A spike-root is a
vertex covered by a spike, but not a spike-leaf itself. For example, the
square inequality x_hj + x_hk + x_ij − x_ik ≤ 2 has two spike-leaves (vertices
u_i and v_k), two spikes (the edges (u_h,v_k) and (u_i,v_j)) and two
spike-roots (vertices u_h and v_j).

Notice that there are facet-defining inequalities whose support consists of
exactly one spike, namely the trivial inequalities x_ij ≤ 1. It follows from
statement (3) in Proposition 3 that nontrivial facet-defining inequalities
never contain spikes covering two spike-leaves.

We say that a graph B is spiked if each vertex of B is covered by exactly one
spike. We say that B is a C4-free graph if it does not contain any cycles of
length four (i.e., C4's). The following holds:

Proposition 4. For k ∈ N, if B is a spiked C4-free connected bipartite graph
with exactly k spikes, then the inequality x(B) ≤ k defines a facet of Q_B.

In view of Propositions 1 and 2, the inequality x(B) ≤ k defined in
Proposition 4 can be lifted to a family of facet-defining inequalities of
Q_{m,p}, for all m, p ≥ k. A subset of this family can be described
explicitly. To achieve this, a new definition is needed: a subset C of
(U_m×V_p)\E(B) is called a chord set for a spiked tree B if, for each path
between two spike-leaves of B, there is an edge in C linking two (arbitrary)
vertices of the path. The following proposition holds:

Proposition 5. For k ∈ N and m, p ≥ k, if B is a spiked tree with exactly k
spikes and C is a minimal chord set for B, then the inequality
x(B) − x(C) ≤ k defines a facet of Q_{m,p}.

To get better acquainted with these spiked tree inequalities, consider for
instance the special case in which the spike-roots u_2, u_3, ..., u_k of the
spiked tree B are all adjacent to the spike-root v_1, as shown in Figure 1
below. A minimal chord set C for this tree must consist of the following
edges: (u_1,v_j) for all j ≥ 2 and, for all i, j ≥ 2 with i ≠ j, either
(u_i,v_j) or (u_j,v_i). Carrying out this construction with k = 1 or k = 2
demonstrates that the trivial inequalities x_ij ≤ 1 and the square
inequalities belong to the family of spiked tree inequalities.
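
For small k the validity and tightness of this star-shaped spiked tree inequality can be checked by enumeration. The sketch below rests on a reading of the special case that should be treated as an assumption: the spikes are (u_1,v_1) and (u_i,v_i) for i ≥ 2, the remaining tree edges are (u_i,v_1) for i ≥ 2, and the chord set takes (u_1,v_j) for j ≥ 2 together with (u_i,v_j) for 2 ≤ i < j. Function names are hypothetical.

```python
from itertools import product


def star_spiked_tree(k):
    """Edge set B and one chord set C, as 1-based pairs (i, j) = (u_i, v_j)."""
    B = {(1, 1)}
    B |= {(i, 1) for i in range(2, k + 1)}   # spike-roots u_i joined to v_1
    B |= {(i, i) for i in range(2, k + 1)}   # spikes (u_i, v_i)
    C = {(1, j) for j in range(2, k + 1)}
    C |= {(i, j) for i in range(2, k + 1) for j in range(i + 1, k + 1)}
    return B, C


def is_block_diagonal(X):
    """Any two row supports are equal or disjoint."""
    supports = [frozenset(j for j, x in enumerate(row) if x) for row in X]
    return all(s == t or not (s & t) for s in supports for t in supports)


def max_lhs(k):
    """Largest value of x(B) - x(C) over all k x k block diagonal matrices."""
    B, C = star_spiked_tree(k)
    best = 0  # the zero matrix is block diagonal and gives 0
    for bits in product((0, 1), repeat=k * k):
        X = [bits[r * k:(r + 1) * k] for r in range(k)]
        if is_block_diagonal(X):
            value = (sum(X[i - 1][j - 1] for i, j in B)
                     - sum(X[i - 1][j - 1] for i, j in C))
            best = max(best, value)
    return best


for k in (1, 2, 3):
    print(k, max_lhs(k))
```

For k = 1, 2, 3 the maximum equals k, so the inequality x(B) − x(C) ≤ k holds and is attained; k = 2 reproduces a square inequality. Enumeration only confirms validity and tightness, not the facet property.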

Another subset of the family of facet-defining inequalities based on C4-free
connected bipartite graphs can also be described explicitly as follows. Let a
spiked cycle be a spiked graph whose spike-roots induce a cycle (notice that
this is a slight abuse of our general definition of a spiked graph, since a
spiked cycle is not a cycle; but this abuse is convenient, and will not cause
any confusion). A subset C of (U_m×V_p)\E(B) is a chord set for the spiked
cycle B if, for each pair (s,t) of spike-leaves of B, and for each of the two
paths P_i (i = 1,2) between s and t, there is an edge e(s,t,P_i) ∈ C such
that:

(a) e(s,t,P_i) links two (arbitrary) vertices of P_i (i = 1,2);

(b) e(s,t,P_1) and e(s,t,P_2) are distinct;

(c) if one of the leaves s and t is covered by both edges e(s,t,P_1) and
e(s,t,P_2), then one of these edges covers s and t.

Proposition 6. For all m, p ≥ k ≥ 6, if B is a spiked cycle with exactly k
spikes and C is a minimal chord set for B, then the inequality
x(B) − x(C) ≤ k defines a facet of Q_{m,p}.

The smallest example of a C4-free spiked cycle is shown in Figure 2. Call this
graph B_6.



It  is  easy  to  see  that  all  of  the  edges  (u2,Vi),  (U2,v,),  (U(,v,),  (U4.V,),  (u„v,)  and  (Uf,V|) 
must be in any chord set for B_6, but on the other hand, there exist various
ways to complete this list to a minimal chord set. As a matter of fact, it can
be checked that each of the matrices Π', Π'' and Π''' hereunder gives rise to
a facet-defining inequality of the form Π X ≤ 6, derived from B_6 as explained
in Proposition 6:

The question arises whether it is possible to describe explicitly other,
possibly more general, subfamilies of facet-defining inequalities based on
C4-free connected bipartite graphs. In Section 5, we present a patching
procedure which partially answers this question.

Some interesting variants of the spiked tree and spiked cycle inequalities can
be generated by adding a single special vertex to B. This yields
facet-defining inequalities whose positive support is not spiked. Details are
omitted from this extended abstract.

Let us finally observe that the incidence graphs of projective planes are very
special (non-spiked) C4-free connected bipartite graphs, which also give rise
to interesting valid inequalities for Q_{m,p}. Details are again omitted.

3.3.  Facet-defining  inequalities  based  on  cycles 

Let C_k be a cycle of length k, with k even. If k ≥ 6, then no component of a
complete bipartite partitioning of C_k can contain more than two edges.
Therefore, the total number of edges of a complete bipartite partitioning of
C_k cannot exceed 2k/3. If k is not a multiple of three, then the inequality
x(C_k) ≤ ⌊2k/3⌋ is facet-defining for Q_{C_k}.
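
The counting argument can be confirmed by brute force for small even k. In this sketch (hypothetical function names) the cycle has vertices 0, ..., k−1, edge e joins vertices e and (e+1) mod k, and even and odd vertices form the two sides of the bipartition. A component is complete bipartite exactly when its number of edges equals the product of the sizes of its two sides.

```python
from itertools import combinations


def is_partitioning(edges, k):
    """True if every component of the chosen cycle edges is complete bipartite."""
    unseen = set(edges)
    while unseen:
        comp = {unseen.pop()}
        while True:
            verts = {v for e in comp for v in (e, (e + 1) % k)}
            touching = {e for e in unseen
                        if e in verts or (e + 1) % k in verts}
            if not touching:
                break
            comp |= touching
            unseen -= touching
        even = sum(1 for v in verts if v % 2 == 0)
        odd = len(verts) - even
        if len(comp) != even * odd:
            return False
    return True


def max_edges(k):
    """Largest number of edges in a complete bipartite partitioning of the cycle."""
    for r in range(k, -1, -1):
        if any(is_partitioning(c, k) for c in combinations(range(k), r)):
            return r


for k in (6, 8, 10):
    print(k, max_edges(k), (2 * k) // 3)
```

For k = 6, 8 and 10 the maximum coincides with the integer part of 2k/3, in line with the bound stated above.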

Define now 3C_k to be the graph induced by the three-chords of C_k (a
three-chord of C_k is an edge joining two vertices at distance 3 in C_k). Then
the following holds:

Proposition 7. For all k ≥ 4 and all m, p with k ≤ 2 min(m,p), the inequality
x(C_k) − x(3C_k) ≤ ⌊2k/3⌋ is valid for Q_{m,p}. If k = 1 (mod 3), then the
inequality induces a facet of Q_{m,p}.


4.  LIFTING  THEOREMS 

Let v_j ∈ V_p. The covering c_Π(v_j) (also denoted as c(v_j), when no
confusion can arise) of a vertex v_j with respect to the valid inequality
Π X ≤ π_0 for Q_{m,p} is defined as follows:

c_Π(v_j) := π_0 − max { Π X | X ∈ S_{m,p} and x_ij = 0 for all i = 1, 2, ..., m }.

(A similar definition would of course apply to any vertex u_i ∈ U_m.) The
covering of an arbitrary vertex is always nonnegative. The covering of v_j is
zero if π_ij = 0 for all i = 1, 2, ..., m, i.e. if v_j is not covered by any
edge in the support of the inequality.
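
A covering can be evaluated by enumeration on a tiny instance. The sketch below takes the covering of v_j to be π_0 minus the largest value of Π X over block diagonal matrices X whose column j is entirely zero; this reading, the function names and the 0-based column index are assumptions of the example. It is applied to a square inequality on a 2×2 matrix.

```python
from itertools import product


def is_block_diagonal(X):
    """Any two row supports are equal or disjoint."""
    supports = [frozenset(j for j, x in enumerate(row) if x) for row in X]
    return all(s == t or not (s & t) for s in supports for t in supports)


def covering(pi, pi0, j):
    """pi0 minus the maximum of Pi X over block diagonal X with column j zero."""
    m, p = len(pi), len(pi[0])
    best = None
    for bits in product((0, 1), repeat=m * p):
        X = [bits[r * p:(r + 1) * p] for r in range(m)]
        if any(X[i][j] for i in range(m)) or not is_block_diagonal(X):
            continue
        value = sum(pi[i][c] * X[i][c] for i in range(m) for c in range(p))
        best = value if best is None else max(best, value)
    return pi0 - best


# Square inequality x_11 + x_12 + x_21 - x_22 <= 2.
pi = [[1, 1], [1, -1]]
print(covering(pi, 2, 0), covering(pi, 2, 1))
```

Under this reading the first column has covering 1 and the second column has covering 0, so both values are nonnegative as stated.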

A tight inequality is a valid inequality for Q_{m,p} with the property that
there exists a complete bipartite partitioning X satisfying the inequality
with equality, and such that all vertices having a strictly positive covering
with respect to the inequality are in the same connected component of X (it
can be checked that all square inequalities are tight).

Finally, call U-extension of K_{m,p} the graph B obtained from K_{m,p} by
adding a vertex u_{m+1} to U_m, a vertex v_{p+1} to V_p, and all the edges
between u_{m+1} and V_p ∪ {v_{p+1}}; that is,

B := ( U_m ∪ {u_{m+1}}, V_p ∪ {v_{p+1}},
       (U_m×V_p) ∪ ({u_{m+1}}×V_p) ∪ {(u_{m+1},v_{p+1})} ).


The following statement describes how a tight facet-defining inequality for
Q_{m,p} can be lifted to a facet-defining inequality for Q_B:

Proposition 8. Let Π X ≤ π_0 be a tight facet-defining inequality for Q_{m,p}.
Let the inequality Γ Y ≤ γ_0 be constructed in the following way:

γ_ij = π_ij                     if (u_i,v_j) ∈ U_m×V_p;
γ_ij = c_Π(v_j)                 if i = m+1 and v_j ∈ V_p;
γ_ij = Σ_{k=1}^{p} c_Π(v_k)     if i = m+1 and j = p+1;

γ_0 = π_0 + Σ_{k=1}^{p} c_Π(v_k).

Then the inequality Γ Y ≤ γ_0 defines a facet of Q_B, where B is the
U-extension of K_{m,p}.

Proposition 8, together with Propositions 1 and 2, implies that the inequality
Γ Y ≤ γ_0 can in turn be lifted to a family of facet-defining inequalities for
Q_{n,q} (n ≥ m, q ≥ p). A special subset of such inequalities, which we call
'totally spiked tight inequalities', can be described explicitly. A totally
spiked inequality is an inequality Π X ≤ π_0 whose support is spiked, and such
that the spiked solution S, defined by s_ij = 1 if and only if (u_i,v_j) is a
spike, satisfies Π S = π_0. A simple example of a totally spiked inequality is
again provided by any square inequality, or by any of the facet-defining
inequalities described in Section 3.2. Now, our next proposition allows us to
lift totally spiked, tight, facet-defining inequalities for Q_{m,p} to totally
spiked, tight, facet-defining inequalities for Q_{m+1,p+1} (of course, a
similar result holds for Q_[...]). When stating this result, we assume that
the spikes, the spike-roots and the spike-leaves of the inequality are
numbered in such a way that spike-root i and spike-leaf i are covered by
spike i.

Proposition 9. Let Π X ≤ π_0 be a totally spiked tight inequality defining a
facet of Q_{m,p}. Let γ_0 and γ_ij be defined as in Proposition 8 for all
edges (u_i,v_j) of the U-extension of K_{m,p}, and let γ_ij = [...] if
u_i ∈ U_m and j = p+1. Then, the inequality Γ Y ≤ γ_0 is a totally spiked
tight inequality defining a facet of Q_{m+1,p+1}.

5.  PATCHING  FACET-DEFINING  INEQUALITIES 

Sometimes, families of valid inequalities, or even facet-defining
inequalities, can be constructed by combining together a number of other valid
inequalities. This section presents such a patching procedure. For simplicity,
we assume that only two valid inequalities for Q_{m,p} are to be combined, say
AX ≤ a_0 and BX ≤ b_0. We denote by A and B, respectively, the vertex-sets of
the supports of these two inequalities; A and B are assumed to be disjoint. We
define the neighborhood of a vertex i with respect to A to be the set
N_A(i) := { j : a_ij > 0 } (similarly for B). Notice that a vertex is not in
its own neighborhood.

Now the patching procedure can be roughly sketched as follows. It makes use of
the concept of covering introduced in Section 4. First, vertices having a
strictly positive covering with respect to AX ≤ a_0 are selected in A, in such
a way that their neighborhoods with respect to A are disjoint; an equal number
of vertices are selected in B in a similar way. Then, the selected vertices
from A and B are matched. The graph for which a valid inequality Γ Y ≤ γ_0
will be derived is the complete bipartite graph induced by the union of the
vertex-sets A and B. Construct now the coefficients of the inequality
Γ Y ≤ γ_0 as follows:

γ_ij = a_ij                       if u_i ∈ A and v_j ∈ A,
γ_ij = b_ij                       if u_i ∈ B and v_j ∈ B,
γ_ij = min { c_B(u_i), c_A(v_j) } if (u_i,v_j) is a matched pair, u_i ∈ B and v_j ∈ A,
γ_ij = min { c_A(u_i), c_B(v_j) } if (u_i,v_j) is a matched pair, u_i ∈ A and v_j ∈ B,
γ_ij = −γ_hk                      if (u_h,v_k) is a matched pair, u_i ∈ N_A(u_h) and v_j ∈ N_A(v_k),
                                  or u_i ∈ N_B(u_h) and v_j ∈ N_B(v_k),
γ_ij = 0                          otherwise.

γ_0 = a_0 + b_0.


Proposition 10. The inequality Γ Y ≤ γ_0 is a valid inequality for Q_{m,p}.

Proposition 11. If AX ≤ a_0 and BX ≤ b_0 are valid inequalities for Q_{m,p}
which have been obtained by patching together a number of spiked tree and
spiked cycle inequalities, then the inequality Γ Y ≤ γ_0 obtained by patching
AX ≤ a_0 and BX ≤ b_0 as explained above is facet-defining for Q_{m,p}.

1. Introduction

For more than a decade, modeling, methodological research and software
development for stand-alone decision support systems have belonged to the
scope of our department. Both on mainframes and PCs, we have had several joint
projects with the Hungarian Electricity Board on electrical energy
optimization problems; projects on inventory and production control have been
carried out in the steel industry; and smaller special applications, for
example menu planning for hospitals and optimal design of trusses for a bus
manufacturing company, must also be mentioned among the successful
applications.

In our department, research and software development on group decision support
began only in 1989, by a small team [1]. The reason for the increasing
interest in such systems is quite simple: in today's organizations decisions
are made mostly collectively. As managers spend more of their time in
meetings, the study of information technology to support meetings becomes
increasingly important.

Several types of group support systems have been developed by the Group
Support Systems research community, varying from collaborative writing to
computer supported negotiation and decision making. A Group Support System can
support meetings which are distributed geographically and temporally. Tasks in
a group decision situation include communication, planning, idea generation,
problem analysis and design, problem solving, negotiation, conflict
resolution, and collaborative document preparation. Group support systems
should provide for the sharing of information among group members and between
the group and the computer. Decision makers may individually get all the
information they need and, to some extent, they can also carry out their
task-dependent decision process individually. The group support system should
help in defining and formalizing the decision problem for the group. The
system should provide the necessary data, tools and methods for solving the
specific decision task individually. Last but not least, the system should
help to achieve a result satisfactory for all group members.

The basic concept for us has been to develop a rather flexible framework, which

- has an attractive user interface,

- is adjustable to different types of group decision situations,

- is able to integrate the knowledge and experience accumulated over the last
decade in our department on stand-alone decision support system design and
development.

Within three years, a PC based system working in the MS WINDOWS environment
has been realized. At present, we are, as it were, conducting a mission with
our WINGDSS system in the really difficult process of convincing people to use
computers for supporting their group decision problems, but the real life
applications of WINGDSS should convince its possible users of its higher
efficiency.

WINGDSS has already proved very helpful in evaluating bids for tenders, for
example at the Tender Bureau of the Hungarian Telecommunication Company. We
developed a model for the appraisal of hotels for the State Property Agency.
At the Ministry of Welfare, WINGDSS is used to support budget allocation
processes for social institutions. We are working on extending the
applicability of our system to more complex problems, for example in
environmental impact analysis. Our experience with real life applications
defines new directions for further developments in WINGDSS.

2. What kind of group decision processes are supported by WINGDSS?

The decision problem can be typified as follows:

A group of experts from different fields but with a common interest has the
task of ranking certain alternatives characterized by a finite set of
properties or attributes. Attributes can be factual data and subjective
factors. Applying a proper utility function to the set of alternatives leads
to a ranking of the alternatives according to their numerical values. The
individual ranking will reflect the individual preferences; the group ranking,
in addition, will incorporate the differences of priorities and the expertise
of the decision makers. Arriving at a group ranking satisfactory to all
members is supported by a series of possibilities for the interactive usage of
WINGDSS. The system provides user-friendly tools for many operations that can
be carried out in the decision process during program execution, on-screen, by
the users themselves. Practically, feedback from the individuals can be
integrated at any stage of the decision process. The system is always ready
for updates.

In the past, stand-alone PCs were more frequently used in Hungary than LANs
and workstations; this is why the present version works on a single PC. In spite
of that, the system provides the atmosphere of a decision room with networked
computers: task formulation, idea generation, and team building are supported in
many ways, but at the same time the individuals' privacy is ensured as well.

The group decision process is conceived as a three-phase event:

-  the  preparation  of  the  decision  task, 

-  the  process  of  individual  evaluation, 

-  the  phase  of  aggregation  (group  result  processing). 

This concept defined three main menu groups for a virtual separation of the
activities; however, they do not prescribe an obligatory sequence of actions: moving
back and forth among the different phases is possible at any stage.

2.1  Task  preparation  phase 

The key problem is transforming the actual decision task into an appropriate form.
Idea organization is one of the main issues in a decision process. The hierarchy of
criteria is a tree in WINGDSS: one starts with the most general criterion, which
corresponds to the root of the tree, and gradually decomposes it into more specific
criteria. The leaves represent the criteria which can be evaluated independently
of each other. Some decision problems can be represented by a tree of several
levels, while others are less decomposable. In the earlier versions of WINGDSS,
variables defined at one leaf criterion were not reachable at the other leaves. The
third version eliminated this drawback by separating the definition and storage of
the variables from the definition and storage of the tree components.

Creating and modifying the criterion tree with on-screen operations is technically
possible thanks to a module which can also be used for graph handling tasks
separately from the WINGDSS system: nodes and subtrees can be constructed, moved,
copied, deleted, renamed and arranged in several ways.

The data of the alternatives can be typed in directly or selected from an
external database. Any database handling system running under MS Windows can
be fitted to WINGDSS. A methodology for selecting records from the external
database can also be defined from within WINGDSS, providing a screening of the
alternatives.

The evaluation of the alternatives starts at the leaf criteria, with functions defined
exclusively for the actual decision task. Finding the appropriate functions and/or
procedures is also a key problem in decision support. Thanks to an interpreter built
into WINGDSS, the functions can also be created or modified on screen by any
authorized individual. The system offers a collection of ready-made functions as
well. Version 3.0 already has the capability of integrating program solvers.

Once the problem has been set up in the necessary form, the next steps are

- for each decision maker (DM), to assign weights to the criteria reflecting their
importance,

- to assign weights (voting powers) to each DM at each criterion, expressing the
DM's competency in evaluating that criterion.

We  assume  the  presence  of  a  system  facilitator  or  supervisor,  who,  with  on  screen 
operations,  composes  the  decision  group,  determines  the  individuals’  authorities, 
and assigns the voting powers. The authorities include the right to construct and
modify the decision task (criteria, alternatives, evaluation procedures) and the right
to participate in the individual and in the group decision process.

2.2  Individual  decision  phase 

Criteria are factual or subjective data. The functions or procedures defined at the
leaf criteria must be identical for all group members, but the values of these
functions are equal on factual data only. The results of the decision makers' individual
evaluations will still vary due to the diversity of preferences. At present the weights
expressing the preferences of a DM must be given explicitly, but we plan to integrate
methods that support this process. The function values on the subjective criteria
very likely depend on the experts' opinions, and the individual preferences will
modulate these differences further. The final score of an alternative during the
individual decision process is calculated as the weighted average of the function
values, starting at the leaf criteria, combining them with the weights, and then
proceeding toward the root of the criterion tree. The mathematical formulation is
relatively simple:

Consider a decision problem with $l$ group members $D_1, \dots, D_l$, $n$ alternatives
$A_1, \dots, A_n$ and $m$ criteria $C_1, \dots, C_m$.

Denote the result of the individual evaluation of decision maker $D_k$ for alternative
$A_j$ on each leaf criterion $C_i$ by $a_{ij}^k$. Assume that the problem arising from the
differences in dimension of the attributes has already been settled.

Let $w_i^k > 0$ be the weight assigned by $D_k$ to $C_i$, $i = 1, \dots, m$, at each branching
of the tree.

The calculation starts at each simple subtree (denoted by $N'$) consisting of leaf
criteria and their father, by the formula

$$a_{N'j}^k = \frac{\sum_{i \in N'} w_i^k \, a_{ij}^k}{\sum_{i \in N'} w_i^k}, \qquad j = 1, \dots, n. \qquad (1)$$

The value $a_{N'j}^k$ is assigned to the root of this simple subtree. The calculation
proceeds toward the root of the criterion tree by combining the weights on the
higher level criteria with the values resulting from one level below. The individual
utility given by $D_k$ for $A_j$ will be assigned to the root.

Note  that  an  additive  multiattribute  model  is  only  applicable  to  the  decision 
problems  when  the  additive  independence  of  the  criteria  can  be  proved  [4],  [5]. 
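
The bottom-up weighted averaging over the criterion tree described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the WINGDSS implementation: the `Criterion` class, the `utility` function and the sample tree are hypothetical names and data introduced here, and the scores are assumed to be already expressed on a common scale.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Criterion:
    """One node of a (hypothetical) criterion tree for a single decision maker."""
    name: str
    weight: float                      # weight assigned by the decision maker, > 0
    children: List["Criterion"] = field(default_factory=list)


def utility(node: Criterion, leaf_scores: Dict[str, float]) -> float:
    """Return the value of `node` for one alternative.

    A leaf takes its score from `leaf_scores`; an inner node takes the
    weighted average of its children's values, so the recursion works
    from the leaves toward the root.
    """
    if not node.children:
        return leaf_scores[node.name]
    total_weight = sum(child.weight for child in node.children)
    weighted_sum = sum(child.weight * utility(child, leaf_scores)
                       for child in node.children)
    return weighted_sum / total_weight


# Illustrative data only: a two-level tree and the scores of one alternative.
root = Criterion("overall", 1.0, [
    Criterion("cost", 2.0),
    Criterion("quality", 1.0, [
        Criterion("durability", 3.0),
        Criterion("service", 1.0),
    ]),
])
scores = {"cost": 60.0, "durability": 80.0, "service": 40.0}
individual_utility = utility(root, scores)   # roughly 63.3 for this data
```

In this sample the "quality" node evaluates to (3*80 + 1*40)/4 = 70, and the root to (2*60 + 1*70)/3, so the weight of the root itself plays no role in the result.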


2.3  Group ranking phase

For objective attributes, only the weights given by a decision maker will be revised
(at each criterion) by the voting power for weighing. However, in the case of subjective
attributes, not only the weights but also the evaluation itself (the $a_{ij}^k$ values) will
be modified at the corresponding leaf criteria by the voting power for qualifying,
where

$V(w)_i^k$ is the voting power assigned to $D_k$ for the DM's weighing on a criterion
$C_i$, and

$V(q)_i^k$ is the voting power assigned to $D_k$ for the DM's qualifying on a subjective
leaf criterion $C_i$.

Now the group utility of the alternative $A_j$ is calculated on the tree of criteria,
basically in the same way as the individual utilities.

First we aggregate the weights $w_i^k$ given by the decision makers at each node $i$
and get the group weights:

$$W_i = \frac{\sum_{k=1}^{l} V(w)_i^k \, w_i^k}{\sum_{k=1}^{l} V(w)_i^k}, \qquad i = 1, \dots, m. \qquad (2)$$

Then we compute the aggregated qualification at each leaf criterion $C_i$ and get
the group qualification at the leaves, for each alternative $A_j$:

$$Q_{ij} = \frac{\sum_{k=1}^{l} V(q)_i^k \, a_{ij}^k}{\sum_{k=1}^{l} V(q)_i^k}, \qquad i \in N', \quad j = 1, \dots, n. \qquad (3)$$

The group utility of $A_j$ is the result of the linear combination of the aggregated
qualification values with the aggregated weights (proceeding from the leaf level
toward the root):

$$U_j = \frac{\sum_{i} W_i \, Q_{ij}}{\sum_{i} W_i}, \qquad j = 1, \dots, n. \qquad (4)$$

A correct group utility function must satisfy the axioms given in [6]. The function
(4) presently used in WINGDSS is appropriate in this respect.
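
As a hedged illustration of the aggregation steps (2) to (4), the following Python sketch computes a group utility for one alternative. It assumes the simplest possible tree, in which all criteria are leaves directly under the root; the function names and the sample numbers are hypothetical and are not taken from WINGDSS.

```python
from typing import List, Sequence


def weighted_mean(values: Sequence[float], powers: Sequence[float]) -> float:
    """Average of `values` weighted by the voting powers `powers`."""
    total = sum(powers)
    if total <= 0:
        raise ValueError("voting powers must sum to a positive number")
    return sum(p * v for p, v in zip(powers, values)) / total


def group_utility(w: List[List[float]], a: List[List[float]],
                  vw: List[List[float]], vq: List[List[float]]) -> float:
    """Group utility of one alternative on a one-level criterion tree.

    w[k][i]  : weight given by decision maker k to criterion i
    a[k][i]  : evaluation of the alternative by decision maker k on criterion i
    vw[k][i] : voting power of decision maker k for weighing on criterion i
    vq[k][i] : voting power of decision maker k for qualifying on criterion i
    """
    n_dm, n_crit = len(w), len(w[0])
    # Step (2): group weights, one per criterion.
    group_w = [weighted_mean([w[k][i] for k in range(n_dm)],
                             [vw[k][i] for k in range(n_dm)])
               for i in range(n_crit)]
    # Step (3): group qualifications at the leaves.
    group_q = [weighted_mean([a[k][i] for k in range(n_dm)],
                             [vq[k][i] for k in range(n_dm)])
               for i in range(n_crit)]
    # Step (4): weighted average of the qualifications with the group weights.
    return sum(gw * gq for gw, gq in zip(group_w, group_q)) / sum(group_w)


# Illustrative data: two decision makers, two criteria.
weights = [[1.0, 3.0], [3.0, 1.0]]
evaluations = [[10.0, 20.0], [30.0, 40.0]]
power_weighing = [[1.0, 1.0], [1.0, 1.0]]
power_qualifying = [[1.0, 1.0], [3.0, 1.0]]
result = group_utility(weights, evaluations, power_weighing, power_qualifying)
```

With these numbers both group weights equal 2, the group qualifications are 25 and 30, and the group utility is 27.5. For a deeper tree the same averaging would be repeated level by level toward the root.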

The third main menu group provides various possibilities to compare the decision
makers' individual weighing and evaluation. The opinions of other group members
will often cause one member to reconsider and modify his or her evaluation. Such
feedback can be realized in WINGDSS: any decision maker is allowed to activate the
appropriate menu again to modify the evaluation of the subjective criteria or to
change his/her preference structure (the individual weights on the criteria).
Changes in the structure of the decision task and/or in the functions at the leaf
criteria can be performed by the supervisor or any authorized user.


2.4  Sensitivity  analysis  of  the  result 

Analysis of the impact of certain decision parameters (individual preference
structure, voting power of decision makers) on the final result can be performed with a
method developed separately [8] and recently integrated into Version 3.0. The
algorithm can be used for different purposes:

- What are the intervals in which the weights can vary without affecting the
ranking of the alternatives?

- If the weights are allowed to vary in given intervals, how will the value and
position of the alternatives change?

- What kind of transformations are needed to change the position of one particular
alternative (to make a low-ranked alternative acceptable, for example)?

- If the group members are to agree on the ranking of a subset of the alternatives
(the top set of one, two, three, four, ... alternatives), what changes are required in
the weights?


3.  Technical  data 


Distributional  format: 

One  floppy  disk  of  740  KB  or  higher. 

Hardware  requirements: 

IBM-PC/AT  386  or  486,  VGA  card,  mouse. 

Operating  system: 

PC-DOS  3.3  or  higher. 

Software  requirement: 

MS  Windows  operating  environment  version  3.0  or  higher. 

How did the authors contribute to WINGDSS?

Tamás Rapcsák and Piroska Turchányi are the present managers of the Group
Decision Project; together with Kriszta Keller, they also carry out research and
modelling. The tree handling module and the handy interface for data input have
been developed by Péter Csáki. Levente Csiszár has been working on the interface
to program solvers, on the interpreter for the criteria evaluating functions, and on
the methods used in both the individual and the group evaluation phases. Ferenc
Fölsz has been responsible for all database functionality. The sensitivity algorithm
is from Csaba Mészáros.

Global optimal solutions with tolerances
and practical composite laminate design

Tibor Csendes*  Zelda B. Zabinsky†  Birna P. Kristinsdottir†


Consider the nonlinear optimization problem

$$\min f(x) \qquad (1)$$

where $f(x) : \mathbb{R}^n \to \mathbb{R}$ is a continuous nonlinear function, and the variables
are subject to the constraints

$$g_j(x) \le 0, \qquad j = 1, 2, \dots, m \qquad (2)$$

where $g_j(x) : \mathbb{R}^n \to \mathbb{R}$ are also continuous functions. Let us denote the
set of feasible points by $A$, that is, $A = \{x \in \mathbb{R}^n : g_j(x) \le 0$ for each
$j = 1, 2, \dots, m\}$.

It often happens that the solution $x^*$ (or an approximation of it)
of a constrained nonlinear optimization problem is known, yet this result
is not suitable for practical use. This is the case when the solution can only
be realized within a certain tolerance $\delta > 0$. If, moreover, at least one of
the constraints is active at the solution, then the $n$-dimensional interval
$[x_i^* - \delta, x_i^* + \delta]$ for $i = 1, 2, \dots, n$ is not entirely feasible (cf. [5] and [8]).

From a practical point of view, it would be better to have a suboptimal
solution in the form of an $n$-dimensional interval $X^*$ (i.e. one for which
$f(x) \le f(x^*) + \varepsilon$ and $g_j(x) \le 0$, $j = 1, 2, \dots, m$, for every $x \in X^*$, with
$\varepsilon > 0$). Such a result interval would also reflect the sensitivity of the
objective function to changes in the arguments on the set of feasible points.


*Kalmár Laboratory, József Attila University, Szeged, Hungary
†Industrial Engineering Program, University of Washington, Seattle, USA

In contrast to interval optimization methods like [2], here an algorithm
for finding a large feasible n-dimensional interval for constrained global
optimization is presented. The resultant interval is iteratively enlarged about
a seed point while maintaining feasibility. An interval subdivision method
is used to check the feasibility of the growing box. The algorithm utilises the
inclusion functions [1,4,7] of the objective and constraint functions. These
are calculated by natural interval extension. The resultant feasible interval
is constrained to lie within a given level set, thus ensuring it is close to the
optimum. It is proved that the algorithm converges in a finite number of
iterations.
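
The idea of growing a box about a seed point while an interval check certifies feasibility can be illustrated with a small Python sketch. It is a simplified stand-in, not the authors' algorithm: the `Interval` class, `certainly_feasible`, `grow_box`, the fixed growth step and the unit-disc constraint are all assumptions made here, and the interval arithmetic omits the outward rounding that a rigorous implementation would need. A level-set restriction could be added by passing f(x) minus the allowed objective value as one more constraint.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        o = as_interval(other)
        return Interval(self.lo + o.lo, self.hi + o.hi)

    def __sub__(self, other):
        o = as_interval(other)
        return Interval(self.lo - o.hi, self.hi - o.lo)

    def __mul__(self, other):
        o = as_interval(other)
        products = (self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi)
        return Interval(min(products), max(products))


def as_interval(value) -> Interval:
    return value if isinstance(value, Interval) else Interval(float(value), float(value))


Constraint = Callable[[Sequence[Interval]], Interval]   # natural extension of g(x) <= 0


def certainly_feasible(constraints: Sequence[Constraint],
                       box: Sequence[Interval], depth: int = 8) -> bool:
    """True only if g <= 0 is verified on the whole box, bisecting when undecided."""
    if all(g(box).hi <= 0.0 for g in constraints):
        return True
    if depth == 0:
        return False                      # undecided boxes are treated as infeasible
    widest = max(range(len(box)), key=lambda i: box[i].hi - box[i].lo)
    mid = 0.5 * (box[widest].lo + box[widest].hi)
    left, right = list(box), list(box)
    left[widest] = Interval(box[widest].lo, mid)
    right[widest] = Interval(mid, box[widest].hi)
    return (certainly_feasible(constraints, left, depth - 1)
            and certainly_feasible(constraints, right, depth - 1))


def grow_box(constraints: Sequence[Constraint], seed: Sequence[float],
             step: float = 0.05, max_rounds: int = 100) -> List[Interval]:
    """Enlarge a box around `seed` face by face while feasibility can be verified."""
    box = [Interval(s, s) for s in seed]
    if not certainly_feasible(constraints, box):
        raise ValueError("seed point could not be verified as feasible")
    for _ in range(max_rounds):
        grew = False
        for i in range(len(box)):
            for d_lo, d_hi in ((-step, 0.0), (0.0, step)):
                trial = list(box)
                trial[i] = Interval(box[i].lo + d_lo, box[i].hi + d_hi)
                if certainly_feasible(constraints, trial):
                    box, grew = trial, True
        if not grew:
            break
    return box


def unit_disc(b: Sequence[Interval]) -> Interval:
    """Illustrative constraint g(x) = x1^2 + x2^2 - 1 <= 0."""
    return b[0] * b[0] + b[1] * b[1] - 1.0


feasible_box = grow_box([unit_disc], [0.0, 0.0])
```

For the disc example the box grows symmetrically until a further step would push a corner outside the unit circle, so every corner of `feasible_box` still satisfies the constraint.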

The ability to determine such a feasible interval is useful for exploring
the neighbourhood of the optimum, and can be practically used in
manufacturing considerations. The numerical properties of the algorithm are
tested and demonstrated by an example problem, and the procedure is
applied to a real-life engineering design problem to construct manufacturing
tolerances for an optimum design of composite materials.

I - INTRODUCTION


This paper investigates the properties of a class of integer
programming models applied to a production planning problem.
Each  product  may  require  processing  on  several  machines  and  may 
involve  precedence  relationships.  The  machines  are  already 
(partially)  committed  and  only  the  residual  capacities  of  each 
machine  in  each  time  period  of  the  planning  horizon  are  available 
for  use.  The  problem  is  to  efficiently  deploy  unused  capacities 
by  determining,  subject  to  market  conditions,  a  production 
schedule. The model lies at the heart of a decision support
system  for  advising  sales  executives  in  determining  the  products 
on  which  to  focus  their  efforts.  The  models  can  be 
computationally  demanding  and  techniques  for  speeding  up  solution 
times  are  highly  desirable.  Various  preprocessing  techniques 
have  been  investigated  and  their  effectiveness  evaluated.  In 
addition,  a  number  of  cutting  plane  approaches  have  been  applied. 
The performance of these approaches, which are both general and
application specific, is examined.

II - THE MODEL


The  production  environment  is  made  up  of  a  set  of 
manufacturing  cells,  each  of  which  may  have  an  amount  of 
unallocated  capacity  (resource)  in  each  of  a  set  of  time  periods 
over  the  planning  horizon.  Each  product  can  be  produced  according 
to  a  number  of  production  structures,  each  of  which  specifies  the 
cell  resources  required  per  unit  of  product  in  each  of  the 
(cell, time  period)  combinations  used  in  the  structure. 

Let  the  following  parameters  define  the  size  of  the  problem: 

np = number of products
ns = number of production structures
nc = number of manufacturing cells
nt = number of time periods

And let (i,j,k,t) be the index set as defined below.

i : product, i = 1...np
j : production structure, j = 1...ns
k : manufacturing cell, k = 1...nc
t : time period, t = 1...nt


The problem data and the variables are:

$cp_{ijkt}$ = amount of capacity used per unit of product i
produced according to structure j, on cell k,
in period t

$sp_{kt}$ = the spare capacity for cell k in period t

$limprod_i$ = market size for product i

$p_i$ = profit per unit of product i

$l_{ij}$ = the minimum production quantity of product i,
when produced according to structure j

$u_{ij}$ = the maximum production quantity of product i,
when produced according to structure j

$x_{ij}$ = number of units of product i, to be produced
according to production structure j

$y_{ij}$ = 1 if product i is produced according to structure j,
0 otherwise

The problem can be formulated as:

$$\max \sum_{i=1}^{np} \sum_{j=1}^{ns} p_i \, x_{ij}$$

subject to

$$\sum_{i=1}^{np} \sum_{j=1}^{ns} cp_{ijkt} \, x_{ij} \le sp_{kt}, \qquad k = 1, \dots, nc, \quad t = 1, \dots, nt \qquad (1)$$

$$\sum_{j=1}^{ns} x_{ij} \le limprod_i, \qquad i = 1, \dots, np \qquad (2)$$

$$x_{ij} - l_{ij} \, y_{ij} \ge 0 \qquad (3)$$

$$x_{ij} - u_{ij} \, y_{ij} \le 0, \qquad i = 1, \dots, np, \quad j = 1, \dots, ns \qquad (4)$$

$$x_{ij} \ge 0 \text{ and integer}, \qquad y_{ij} \in \{0, 1\}$$

Constraint set (1) states that the total amount of resource
used in cell k, time period t, must not exceed the spare
capacity; constraint set (2) represents the market size for each
product i; and constraint sets (3) and (4) state the minimum and
maximum quantity of product i, when produced according to
structure j.

III - PREPROCESSING

In order to obtain a tighter formulation of the problem, the
following preprocessing techniques have been investigated.

1) Euclidean reduction [HP-91]

i) For each capacity row (constraint set (1)) let kexp be the
smallest nonnegative integer such that $10^{kexp} \, cp_{ijkt}$
is integer for all i, j;

ii) Find K, the greatest common divisor of the resulting integer
coefficients;

iii) Multiply the row by $10^{kexp}$ and divide both sides by K;

iv) If the RHS after the division is not integer, a tighter
representation may be derived. Since the constraint type is 'less
than or equal' and the variables are integer, the RHS can be
truncated to the next lowest integer.
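
Steps i) to iv) of the Euclidean reduction can be tried out on a single row with the short Python sketch below. It is an illustration under assumptions: the function name `euclidean_reduction` is hypothetical, the coefficients are passed as decimal strings so that the power of ten is found exactly, and the row is assumed to be a 'less than or equal' constraint with nonnegative coefficients on integer variables.

```python
from decimal import Decimal
from functools import reduce
from math import floor, gcd
from typing import List, Sequence, Tuple


def euclidean_reduction(coeffs: Sequence[str], rhs: str) -> Tuple[List[int], int]:
    """Scale a <= row to integer coefficients, divide by their gcd, round the RHS down."""
    values = [Decimal(c) for c in coeffs]
    # i) smallest nonnegative kexp such that 10**kexp * c is integer for every c
    kexp = max([0] + [-v.normalize().as_tuple().exponent for v in values])
    scale = 10 ** kexp
    scaled = [int(v * scale) for v in values]
    # ii) greatest common divisor of the integer coefficients
    divisor = reduce(gcd, scaled)
    if divisor == 0:
        raise ValueError("row has no nonzero coefficient")
    # iii) + iv) divide the row by the gcd and truncate the RHS
    new_rhs = floor(Decimal(rhs) * scale / divisor)
    return [s // divisor for s in scaled], new_rhs


# Illustrative row: 1.5 x1 + 4.5 x2 + 3.0 x3 <= 10
reduced_row = euclidean_reduction(["1.5", "4.5", "3.0"], "10")
```

For the sample row the scale factor is 10 and the divisor is 15, so the row becomes x1 + 3 x2 + 2 x3 <= 6, which cuts off the fractional slack between 6 and 100/15.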

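2) Redundant constraints [BMW-75]

For each capacity row:

i) Compute the constraint upper bound. Since $cp_{ijkt}$ is nonnegative
for all (i, j, k, t), the largest value the left-hand side can take is

$$\sum_{i=1}^{np} \sum_{j=1}^{ns} cp_{ijkt} \, u_{ij}$$

ii) The constraint is redundant if:

$$\sum_{i=1}^{np} \sum_{j=1}^{ns} cp_{ijkt} \, u_{ij} \le sp_{kt}$$

3) Singleton rows [BMW-75]

Consider a capacity constraint

$$\sum_{i=1}^{np} \sum_{j=1}^{ns} cp_{ijkt} \, x_{ij} \le sp_{kt}$$

such that $cp_{ijkt} = 0$ for all $(i, j) \ne (g, h)$.

Then let:

$$u'_{gh} = \frac{sp_{kt}}{cp_{ghkt}}$$

If:

i) $u'_{gh} \le u_{gh}$: replace $u_{gh}$ by $\lfloor u'_{gh} \rfloor$;

ii) $u'_{gh} < l_{gh}$: fix $x_{gh} = 0$ and $y_{gh} = 0$.

Remove the constraint.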


4) Infeasibility  and  simple  redundancy  [HP-91] 

In  this  form  of  preprocessing  redundant  constraints  may  be 
identified  and  removed.  The  procedure  is  as  follows: 

i) In  each  capacity  row,  determine  the  nonzero  count. 

ii) For  the  rows  with  equal  nonzero  count  determine  rows  whose 
nonzeros  match  exactly  both  in  terms  of  value  and  column  number. 

iii) When two rows agree, check their RHS:

a)  if  the  RHS  are  equal,  remove  one  of  the  rows; 

b) if  one  inequality  dominates  the  other,  remove  the  dominated 
one. 
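
The duplicate-row procedure above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the row representation and the name `remove_duplicate_rows` are assumptions, all rows are taken to be 'less than or equal' constraints, and a dictionary keyed on the full nonzero pattern is used in place of first bucketing the rows by nonzero count (rows with the same pattern necessarily have the same count).

```python
from typing import Dict, List, Tuple

# A row is ({column index: coefficient}, rhs) and stands for  sum(coef * x) <= rhs.
Row = Tuple[Dict[int, float], float]


def remove_duplicate_rows(rows: List[Row]) -> List[Row]:
    """Among rows with identical nonzeros keep only the one with the smallest RHS."""
    kept: Dict[tuple, int] = {}          # nonzero pattern -> index of the row kept
    for index, (coeffs, rhs) in enumerate(rows):
        pattern = tuple(sorted((col, val) for col, val in coeffs.items() if val != 0))
        previous = kept.get(pattern)
        # For <= rows the smaller right-hand side dominates the larger one.
        if previous is None or rhs < rows[previous][1]:
            kept[pattern] = index
    return [rows[i] for i in sorted(kept.values())]


# Illustrative data: rows 0, 1 and 3 share the same left-hand side.
capacity_rows: List[Row] = [
    ({0: 1.0, 2: 2.0}, 10.0),
    ({0: 1.0, 2: 2.0}, 8.0),
    ({1: 3.0}, 5.0),
    ({2: 2.0, 0: 1.0}, 8.0),
]
cleaned_rows = remove_duplicate_rows(capacity_rows)
```

Here the first and fourth rows are dropped: the first is dominated by the tighter right-hand side 8, and the fourth is an exact copy of the second.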

This technique can be extended to detect infeasibilities in
the form of conflicting constraints. However, the capacity
constraints of the model are all 'less than or equal to' and
therefore this possibility cannot arise.

IV - CUTTING PLANES


The most efficient way to get a tighter formulation for an
integer problem is to incorporate strong valid inequalities. We
have investigated two classes of valid inequalities.

The first relates to a reduced form of the model. Consider
the bounding constraints (3) and (4). Replace these constraints
by the aggregate bounding constraints (5) and (6).

$$\sum_{j=1}^{ns} (x_{ij} - l_{ij} \, y_{ij}) \ge 0 \qquad (5)$$

$$\sum_{j=1}^{ns} (x_{ij} - u_{ij} \, y_{ij}) \le 0, \qquad i = 1, \dots, np \qquad (6)$$

The new model is a relaxation with 2np(ns-1) fewer bounding
constraints. If any of the original bounding constraints are
violated by its solution, they can be introduced as cuts to the
new model.

For  the  second  approach,  we  consider  the  capacity  rows 
together  with  constraints  (3)  and  (4)  and  reformulate  them  as  a 
single  node  fixed  charge  flow. 

Given:

$$\sum_{i=1}^{np} \sum_{j=1}^{ns} cp_{ijkt} \, x_{ij} \le sp_{kt}, \qquad k = 1, \dots, nc, \quad t = 1, \dots, nt$$

$$x_{ij} - l_{ij} \, y_{ij} \ge 0$$

$$x_{ij} - u_{ij} \, y_{ij} \le 0, \qquad i = 1, \dots, np, \quad j = 1, \dots, ns$$

Let:

$$x'_{ij} = cp_{ijkt} \, x_{ij}, \qquad u'_{ij} = cp_{ijkt} \, u_{ij}, \qquad l'_{ij} = cp_{ijkt} \, l_{ij}$$

and relax $x_{ij}$ to $x'_{ij} \in \mathbb{R}$.

A production planning model to advise sales executives on the
products on which they should concentrate in order to efficiently
deploy unused factory capacities has been developed. To speed up
the solution process, a set of preprocessing techniques and valid
inequalities has been investigated. Experimental results for a
range of model sizes (to be reported) indicate that the
procedures have a beneficial effect.

I. INTRODUCTION

The  application  considered  in  this  paper  is  the 
determination  of  an  annual,  day-by-day,  schedule  for  a 
fleet  of  United  States  Coast  Guard  cutters  within  a  given 
geographical  area.  The  tasks  assigned  to  the  cutters  are 
varied  and  include  patrolling  in  specified  districts  of  the 
area,  training  exercises  and  maintenance.  The  feasibility 
of  a  particular  cutter  schedule  is  governed  by  a  set  of 
operational rules that depend, in part, on the timing and
nature of the tasks already assigned to the cutter prior to
the start of the scheduling year in question. Other
factors  include  transit  allowances  before  and  after  a  task, 
the  duration  of  in-port  time  after  the  completion  of  a  task 
and  cutter  capabilities.  The  requirements  placed  on  the 
fleet  fall  into  two  principal  categories.  The  first 
relates  to  minimum  levels  of  cover  in  terms  of  the  number 
of  cutters  of  given  classes  on  patrol  in  each  area  at  any 
given  time.  The  second  category  of  requirement  concerns 
training  and  cutter  maintenance.  In  these  cases  upper 
limits  are  placed  on  the  number  of  cutters  undergoing  these 
tasks  at  any  given  time. 

One  approach  to  modelling  scheduling  problems  of  this  type 
is  to  generate,  for  each  cutter,  a  set  of  possible 
schedules,  and  to  determine  the  'best'  fleet  schedule  by 
selecting  one  of  the  possible  schedules  for  each  cutter. 
This formulation leads to an integer programming model
which has been widely advocated (e.g. [2]) in various
guises.

However,  in  many  scheduling  applications,  especially  in  the 
area  of  vehicle  scheduling,  the  list  of  'requirements' 
often  results  in  the  non-existence  of  a  feasible  solution. 
In  such  cases,  a  relaxation  of  the  requirements  is 
necessary  in  order  to  obtain  a  schedule.  On  further 
examination  of  the  practical  issues  it  is  frequently  the 
case  that  some  of  the  'requirements'  merely  reflect 
desirable characteristics rather than strict requirements.
In developing a scheduling model that yields useful
solutions, Darby-Dowman and Mitra [1] proposed the extended
set partitioning model, which was essentially an integer
goal programming formulation of a set partitioning model
and admitted set covering, set packing and set partitioning
as special cases.

In that model, the requirements were treated as targets and
undercover (below target) and overcover (above target) were
allowed but penalised. Our approach to the cutter
scheduling problem follows a similar vein.

II. MODELLING THE SCHEDULING PROBLEM

Within  the  operational  rules  that  govern  the  tasks  and  the 
cutter  duties,  a  set  of  possible  schedules  is  generated  for 
each  cutter.  An  optimal  schedule  is  one  which  is  as  close 
to  meeting  requirements  as  is  possible.  The  generic  model 
is  stated  below: 

Parameters:

nc = number of cutters to be scheduled
$n_k$ = number of columns (possible schedules) for cutter k, k = 1, 2, ..., nc
nt = number of time periods in the scheduling year
ng = number of constraint groups (schedule 'requirements')

Index sets

i : schedule requirement, i = 1, 2, ..., ng
j : time period, j = 1, 2, ..., nt
k : cutter identifier, k = 1, 2, ..., nc
l : identifier of possible schedule for cutter k, l = 1, 2, ..., $n_k$

Model Variables

$x_{kl}$ = 1 if the l'th possible schedule for cutter k is selected
= 0 otherwise

$u_{ij}$ : Extent of the under-achievement in respect
of schedule requirement i in time period j.

$o_{ij}$ : Extent of the over-achievement in respect of
schedule requirement i in time period j.

$a_{ijkl}$ = 1 if the l'th possible cutter schedule for
cutter k contributes to schedule requirement
i in time period j
= 0 otherwise.

$r_{ij}$ : target/limit/threshold for schedule
requirement i in time period j in terms of
number of cutters contributing to the
schedule requirement.

$w^-_{ij}$ ($\ge 0$) : Penalty for each unit of under-achievement
in respect of $r_{ij}$

$w^+_{ij}$ ($\ge 0$) : Penalty for each unit of over-achievement in
respect of $r_{ij}$

Model

$$\min \sum_{i=1}^{ng} \sum_{j=1}^{nt} \left( w^-_{ij} \, u_{ij} + w^+_{ij} \, o_{ij} \right)$$

subject to

$$\sum_{k=1}^{nc} \sum_{l=1}^{n_k} a_{ijkl} \, x_{kl} + u_{ij} - o_{ij} = r_{ij}, \qquad i = 1, 2, \dots, ng, \quad j = 1, 2, \dots, nt$$

$$\sum_{l=1}^{n_k} x_{kl} = 1, \qquad k = 1, 2, \dots, nc$$

$$x_{kl} \in \{0, 1\}, \qquad k = 1, 2, \dots, nc, \quad l = 1, 2, \dots, n_k$$
Remarks  on  the  Model 

(1) The model is stated in a generic form. In any given
application instance, simplifications may be possible.
For example, if schedule requirement i is such that $r_{ij}$
represents a desired lower limit then the penalty for
over-achievement, $w^+_{ij}$, can be set at zero and $o_{ij}$ considered as a
logical variable. Similarly, if schedule requirement i is
such that $r_{ij}$ represents a desired upper limit then the
penalty for under-achievement, $w^-_{ij}$, can be set at zero and
$u_{ij}$ considered as a logical variable.

(2)  The  number  of  time  periods  in  the  scheduling  year  is 
a  matter  for  judgement  in  relation  to  the  individual 
application.  A  day-to-day  schedule  covering  one  year  is 
required.  Thus  in  its  simplest  form,  nt  equals  365  or 
366.  However,  model  size  and  hence  solution  time  can  be 
reduced  by  considering  a  larger  time  unit.  In  certain 
cases  this  can  be  achieved  without  loss  of  model  validity. 
For example, if the duration of tasks and activities is
always an integral number of weeks, the problem can be
modelled on a week-to-week basis with nt reduced to 53.
Even if the duration of tasks and activities is not always
an integral number of weeks it may still be worthwhile to
adopt the time unit change in order to obtain solutions
more quickly with a possible sacrifice in solution quality.
This aspect is investigated in Section III.

(3) The cutter scheduling problem reported here has the
following size parameters. There are 30 cutters to be
scheduled. There are 9 sets of schedule requirements, 5
of which represent desired minimum levels with the
remaining 4 representing desired upper limits. The
minimum levels/upper limits are invariant through time and
range from a value of 1 to a value of 5.
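
To make the goal-programming objective concrete, the Python sketch below evaluates the penalty of one given selection of schedules, that is, it computes the under- and over-achievement for each (requirement, period) pair and sums the weighted deviations. It does not solve the integer programme; the function name, the data layout and the toy instance are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Set, Tuple

Key = Tuple[int, int]     # (schedule requirement i, time period j)


def schedule_penalty(selected: Sequence[int],
                     coverage: List[List[Set[Key]]],
                     targets: Dict[Key, int],
                     w_under: Dict[Key, float],
                     w_over: Dict[Key, float]) -> Tuple[float, Dict[Key, Tuple[int, int]]]:
    """Penalty of choosing schedule selected[k] for each cutter k.

    coverage[k][l] is the set of (i, j) pairs to which the l'th possible
    schedule of cutter k contributes; targets[(i, j)] is the target r_ij.
    Returns the total penalty and the (under, over) deviation per pair.
    """
    achieved = {key: 0 for key in targets}
    for cutter, choice in enumerate(selected):
        for key in coverage[cutter][choice]:
            if key in achieved:
                achieved[key] += 1
    total = 0.0
    deviations: Dict[Key, Tuple[int, int]] = {}
    for key, target in targets.items():
        under = max(0, target - achieved[key])
        over = max(0, achieved[key] - target)
        total += w_under[key] * under + w_over[key] * over
        deviations[key] = (under, over)
    return total, deviations


# Toy instance: 2 cutters, 1 requirement (i = 0), 2 time periods.
toy_coverage = [
    [{(0, 0)}, {(0, 1)}],              # cutter 0: two possible schedules
    [{(0, 0), (0, 1)}, set()],         # cutter 1: two possible schedules
]
toy_targets = {(0, 0): 1, (0, 1): 2}
toy_w_under = {(0, 0): 10.0, (0, 1): 10.0}
toy_w_over = {(0, 0): 1.0, (0, 1): 1.0}

penalty_a, _ = schedule_penalty([0, 0], toy_coverage, toy_targets, toy_w_under, toy_w_over)
penalty_b, _ = schedule_penalty([1, 0], toy_coverage, toy_targets, toy_w_under, toy_w_over)
```

In the toy instance the first selection over-covers period 0 by one cutter and under-covers period 1 by one, giving a penalty of 1 + 10 = 11, while the second selection meets both targets exactly. Setting a penalty to zero for one direction reproduces the lower-limit and upper-limit cases of remark (1).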

III. TIME UNIT COMPRESSION

A  major  factor  influencing  the  difficulty  with  which  the 
model  may  be  solved  is  the  number  of  constraints.  Each 
scheduled  requirement  results  in  (nt)  constraints.  As 
stated  in  the  previous  section  the  most  natural  form  of  the 
model  has  (nt)  equal  to  the  number  of  days  in  the  year  (365 
or  366).  With  9  (sets  of)  schedule  requirements,  the 
model  would  have  over  3000  rows.  In  addition  the  columns 
are  very  dense  with,  typically,  200-250  nonzeros  per 
column.  The  idea  of  developing  a  'coarser'  model  in  which 
each time period is increased in size (e.g. from 1 day to
7 days = 1 week) is attractive in significantly reducing
the  size  of  the  model  both  in  terms  of  the  number  of  rows 
and  the  number  of  nonzeros.  Clearly  there  may  be  a 
reduction  in  solution  quality  since  the  model  may  be  a  less 
precise  description  of  the  scheduling  problem. 

The  activities  performed  by  the  cutters  involve  patrolling, 
maintenance  and  training.  Each  of  these  activities  takes 
place  in  various  forms.  For  example,  the  area  within 
which  patrolling  takes  place  is  divided  into  various 
districts,  each  with  its  own  requirements  in  terms  of 
cutter  coverage.  Additionally  there  are  activities  such 
as  transit  between  tasks  and  necessary  time  spent  in  port. 
With  the  exception  of  transit  times,  the  required  durations 
or  range  of  durations  of  the  tasks  tend  to  be  specified  in 
terms  of  an  integral  number  of  weeks  and,  as  a  consequence, 
the  time  unit  compression  from  days  into  weeks  appears  more 
likely  to  be  viable.  Some  tasks  (e.g.  training  and 
maintenance)  are  required  to  start  on  a  specified  day  of 
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the  week  -  but  not  necessarily  the  sane  day  of  the  week. 
It  is  for  this  reason  that  the  weekly  model  introduces  an 
element  of  approximation  compared  to  the  daily  model. 
Consider  the  example  shown  in  Figure  1.  Suppose  that  the 
time  unit  compression  is  such  that  each  Monday  through 
Sunday time period of 7 consecutive days is considered as
one  new  time  unit.  Suppose  further  that  a  training  task 
is  required  to  start  on  a  Monday  and  a  maintenance  task  is 
required  to  start  on  a  Wednesday.  Then  in  the  example, 
training  takes  place  in  week  k  since  it  takes  place  on 
every  day  of  week  k.  However,  the  maintenance  task  takes 
place  only  for  part  of  week  k.  In  converting  from  a  daily 
to a weekly model, the question of whether a given task
takes  place  in  a  given  week  must  be  addressed. 


[Figure 1: Task start days example. Day-by-day timeline (M T W T F S S) across weeks k-1, k and k+1, with one bar for the training task and one for the maintenance task.]


The  proposed  model  treats  this  issue  conservatively  such 
that  a  feasible  weekly  schedule,  when  converted  back,  will 
necessarily  yield  a  feasible  daily  schedule.  To  achieve 
this,  the  treatment  depends  on  the  type  of  constraint 
considered. The schedule requirements are specified in
terms of desired upper limits (implied less-than-or-equal-to
constraints) or desired lower limits (implied
greater-than-or-equal-to constraints). In the former,
over-achievement is penalised whilst in the latter,
under-achievement is penalised.

The  constraint  coefficients  of  the  weekly  model  are 
therefore determined as follows:

If the constraint group (schedule requirement) i is a
'desired upper limit' type constraint then

    $a_{ijkl} = 1$ if the l'th possible (daily) schedule for
                 cutter k covers all 7 days of week j,
    $a_{ijkl} = 0$ otherwise.

If the constraint group (schedule requirement) i is a
'desired lower limit' type constraint then

    $a_{ijkl} = 1$ if the l'th possible (daily) schedule for
                 cutter k covers any day of week j,
    $a_{ijkl} = 0$ otherwise.
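The two coefficient rules can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `weekly_coefficients`, the list-of-0/1 representation of a daily schedule and the three-week example are assumptions made here. The `all` rule corresponds to the coefficient defined in the text for 'desired upper limit' groups, and the `any` rule to the one defined for 'desired lower limit' groups.

```python
from typing import List


def weekly_coefficients(daily: List[int], rule: str, days_per_week: int = 7) -> List[int]:
    """Compress a 0/1 daily coverage vector into 0/1 weekly coefficients.

    rule="all": a week counts only if every day in it is covered.
    rule="any": a week counts if at least one day in it is covered.
    A trailing partial week is judged on the days it actually has.
    """
    if rule not in ("all", "any"):
        raise ValueError("rule must be 'all' or 'any'")
    test = all if rule == "all" else any
    weeks = []
    for start in range(0, len(daily), days_per_week):
        chunk = daily[start:start + days_per_week]
        weeks.append(1 if test(day == 1 for day in chunk) else 0)
    return weeks


# Hypothetical data: three weeks, Monday first.
# Training covers the whole of the middle week; maintenance starts on the
# Wednesday of the middle week and runs into the following week.
training = [0] * 7 + [1] * 7 + [0] * 7
maintenance = [0] * 7 + [0, 0, 1, 1, 1, 1, 1] + [1, 1, 0, 0, 0, 0, 0]

print(weekly_coefficients(training, "all"))     # [0, 1, 0]
print(weekly_coefficients(maintenance, "all"))  # [0, 0, 0]
print(weekly_coefficients(maintenance, "any"))  # [0, 1, 1]
```

In this example the two rules agree for the training task and differ only for the maintenance task, which covers part of a week.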

The  overall  modelling  strategy  is  illustrated  in  Figure  2. 
The  daily  model  is  generated  and  converted  to  a  weekly 
model  as  described  above.  The  weekly  model  is  then  solved 
and  the  solution  in  terms  of  a  weekly  schedule  for  each 
cutter  is  obtained.  These  weekly  schedules  are  matched 
with  the  schedules  of  the  daily  model  to  obtain  daily 
schedules which are then post-processed. The post-processing
performs a series of quick local optimisations
which apply daily time shifts to patrolling tasks whenever
such shifts lead to local improvements.


[Figure 2: Overall Modelling Strategy]

IV. SUMMARY

A  practical  model  for  determining  an  annual  day-by-day 
schedule  for  a  fleet  of  United  States  Coast  Guard  cutters 
has  been  developed.  The  model  was  made  computationally 
more  tractable  by  considering  a  smaller  number  of  larger 
time  periods.  Results  (to  be  reported)  indicate  that 
little  is  sacrificed  in  terms  of  solution  quality  by 
adopting  this  form  of  time  unit  compression. 

In  common  with  many  scheduling  problems,  the  simultaneous 
satisfaction  of  all  requirements  may  not  be  possible.  The 
use of what is essentially an integer goal programming
model guarantees that a feasible solution always exists. The
solution  is  either  a  completely  satisfactory  schedule  or  as 
near  to  one  as  is  possible. 
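The role of the deviation terms can be illustrated with a small brute-force sketch in Python. It is a toy under stated assumptions, not the model used in the decision support system: the function name `best_selection`, the nested-list data layout and the tiny instance are hypothetical, and exhaustive enumeration stands in for an integer programming solver. Every choice of one schedule per cutter is admissible, because any shortfall or excess against a target is absorbed by an under- or over-achievement term and charged a penalty.

```python
from itertools import product


def best_selection(schedules, targets, w_under, w_over):
    """Pick one candidate schedule per cutter, minimising weighted deviations.

    schedules[k][l][i][t] is 1 if schedule l of cutter k contributes to
    requirement i in period t. targets, w_under and w_over are indexed [i][t].
    Returns (total penalty, tuple of chosen schedule indices).
    """
    n_req, n_per = len(targets), len(targets[0])
    best = None
    for choice in product(*(range(len(cands)) for cands in schedules)):
        cost = 0
        for i in range(n_req):
            for t in range(n_per):
                cover = sum(schedules[k][l][i][t] for k, l in enumerate(choice))
                under = max(0, targets[i][t] - cover)  # shortfall against the target
                over = max(0, cover - targets[i][t])   # excess over the target
                cost += w_under[i][t] * under + w_over[i][t] * over
        if best is None or cost < best[0]:
            best = (cost, choice)
    return best


# Hypothetical instance: 2 cutters, 1 requirement, 3 periods.
# No candidate schedule covers the last period, so the target cannot be met.
schedules = [
    [[[1, 1, 0]], [[1, 0, 0]]],  # cutter 0: two candidate schedules
    [[[0, 1, 0]], [[1, 1, 0]]],  # cutter 1: two candidate schedules
]
targets = [[1, 1, 1]]
w_under = [[2, 2, 2]]
w_over = [[1, 1, 1]]

print(best_selection(schedules, targets, w_under, w_over))  # (2, (1, 0))
```

Here no selection meets every target, yet the sketch still returns the selection with the smallest total penalty, which mirrors the "as near to one as is possible" behaviour described above.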

The model lies at the heart of a decision support system
that is being implemented by the United States Coast Guard.

1. INTRODUCTION

Languages for representing Linear Programming models for optimization are well
established. These languages follow a simple algebraic structure to represent the
linear form restrictions and are adequate for a vast range of LP models.

There are many experimental and commercial systems of this genre which are used
by industry; for an up-to-date review of such systems the reader is referred to (Steiger
and Sharda, 1991) and (Greenberg, 1991). Most modern modelling systems enable the
modeller to specify models in a declarative algebraic language. A set of algebraic
statements in a modelling language both specifies and documents a model, whereas
the generation of a machine readable constraint matrix takes place in the background.

It is now increasingly realized that alternative modelling paradigms such as database
modelling, mathematical programming modelling, simulation, logic programming and
programs written in a high level computer language are essentially different forms of
knowledge representation as perceived by the AI community (Geoffrion, 1990),
(Mitra, 1989). Knowledge expressed in a declarative form and knowledge specified
in a procedural form are the two main approaches to knowledge representation.

In this paper we first identify an important deficiency of many known mathematical
programming modelling languages. These languages are well designed to represent
large classes of LP and IP models in the declarative form. A wide class of other
optimization models applying to many real problems such as crew scheduling, cutting
stock, VLSI routing and ship scheduling provide instances of models which are highly
dependent on domain knowledge. For these models the domain knowledge
concerning the rules of crew duties, alternative ways of defining cutting patterns,
possible minimum cost routes, or a set of tasks around calendar dates can only be
specified in a procedural form. Although modelling systems are well set out to
structure the model components, by their very nature these modelling systems lack
the procedural constructs. As a result, activity based LP models which are
constructed by a column generation strategy across a fixed structure of rows cannot
be developed within these systems.

We have introduced extensions to an established LP modelling system, namely MPL,
whereby the procedural knowledge is introduced through a dynamic binding of the
high level modelling language. We also introduce object orientation, thus taking
advantage of encapsulation, inheritance, and information hiding. In this way we can
capture procedural knowledge in the form of methods within self contained objects.
The extension is illustrated by an example of an optimum cutter scheduling problem
studied by the authors.

ACTIVITY  BASED  LP/IP  MODELS 

As  explained  in  the  introduction  many  practical  models  are  such  that  the  underlying 
LP  can  only  be  constructed  if  the  activities  specifying  the  technology  matrix,  that  is, 
the  columns,  are  computed  using  the  domain  knowledge  of  the  application.  We 
consider  a  few  examples. 

Crew Scheduling: Both air crew (Johnson, 1990) and bus crew scheduling (Darby-Dowman
and Mitra, 1985) problems are, by nature, set partitioning or set covering
problems or their extensions. The rows represent legs of flight or pieces of work
which must be covered by the crew. The columns represent ways of carrying out
a work shift that is legal within union regulations and accepted practices.

VLSI Routing: VLSI routing (Pulleyblank, 1992) is a well known combinatorial
problem  which  has  the  structure  of  a  set  problem  in  which  columns  are  generated 
after  solving  a  travelling  salesman  problem. 

Cutting Stock Problem: One and two dimensional cutting stock problems
(Gomory, 1965) have many industrial applications in the area of minimum wastage of
sheet material. Here again the columns are constructed by a combinatorial procedure
for fitting patterns within a two dimensional master area. Alternatively, by solving
a knapsack problem using a dynamic programming recursion, efficient patterns and
corresponding columns can be generated.

Cutter Scheduling Problem: We have developed a model (Darby-Dowman, 1992)
which uses integer goal programming in respect of a set problem with some additional
choice constraints. The problem is set out in the next section.
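Of the examples above, the cutting stock case is the easiest to sketch in code. The following Python fragment is a minimal illustration of generating one cutting pattern (one column) by a dynamic programming recursion for the one dimensional knapsack problem. The function name `best_pattern`, the integer lengths and the item values are assumptions for illustration; in a column generation scheme the values would typically be the current dual prices of the demand rows, which are not modelled here.

```python
def best_pattern(stock_length, lengths, values):
    """Unbounded knapsack by dynamic programming over integer capacities.

    Returns (total value, counts), where counts[i] is the number of pieces
    of item i that the pattern cuts from one stock length.
    """
    best = [0] * (stock_length + 1)    # best[c]: maximum value within capacity c
    last = [-1] * (stock_length + 1)   # item added last to reach best[c], -1 if none
    for c in range(1, stock_length + 1):
        best[c] = best[c - 1]          # leaving one unit of material unused is allowed
        for i, (length, value) in enumerate(zip(lengths, values)):
            if length <= c and best[c - length] + value > best[c]:
                best[c] = best[c - length] + value
                last[c] = i
    # Walk back through the table to recover the pattern itself.
    counts = [0] * len(lengths)
    c = stock_length
    while c > 0:
        if last[c] == -1:
            c -= 1                     # unused unit of material
        else:
            counts[last[c]] += 1
            c -= lengths[last[c]]
    return best[stock_length], counts


# Hypothetical data: stock of length 10, three item lengths with given values.
print(best_pattern(10, [3, 4, 5], [2, 3, 4]))  # (8, [0, 0, 2])
```

The returned count vector is exactly the kind of column that a declarative modelling language cannot generate on its own, which is the deficiency the paper addresses.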

2.  STATEMENT  OF  THE  CUTTER  SCHEDULING  PROBLEM 

The problem involves creating an annual schedule for nc cutters to carry
out tasks such as patrol, maintenance and training, whereby the schedule specifies the
activity of each cutter for each day of the year.

Define the parameters

nc = number of cutters to be scheduled,

$n_k$ = number of columns (possible schedules) for cutter k,

nt = number of time periods in the scheduling year,

ng = number of constraint groups (schedule 'requirements').

The corresponding index sets

i : i = 1,2,...,ng; schedule requirements,

t : t = 1,2,...,nt; time period,

k : k = 1,2,...,nc; cutter identifier,

l : l = 1,2,...,$n_k$; identifier of possible schedule for cutter k.


Model Variables

$x_{kl}$ = 1 if the l-th possible schedule for cutter k is selected, 0 otherwise,

$u_{it}$: extent of the under-achievement in respect of schedule requirement i in time
period t,

$o_{it}$: extent of the over-achievement in respect of schedule requirement i in time
period t.


Model Coefficients

$a_{itkl}$ = 1 if the l-th possible schedule for cutter k contributes to schedule
requirement i in time period t, 0 otherwise,

$r_{it}$: target/limit/threshold for schedule requirement i in time period t in terms of
number of cutters contributing to the schedule requirement,

$w^u_{it}$ ($\ge 0$): penalty for each unit of under-achievement in respect of $r_{it}$,

$w^o_{it}$ ($\ge 0$): penalty for each unit of over-achievement in respect of $r_{it}$.


The model is stated as

$$\min \sum_{i=1}^{ng} \sum_{t=1}^{nt} \left( w^u_{it} u_{it} + w^o_{it} o_{it} \right)$$

subject to

$$\sum_{k=1}^{nc} \sum_{l=1}^{n_k} a_{itkl} x_{kl} + u_{it} - o_{it} = r_{it}, \quad i = 1,2,\ldots,ng; \; t = 1,2,\ldots,nt,$$

$$\sum_{l=1}^{n_k} x_{kl} = 1, \quad k = 1,2,\ldots,nc,$$

$$x_{kl} \in \{0,1\}, \quad u_{it} \ge 0, \quad o_{it} \ge 0.$$


3. COLUMN GENERATION PSEUDO CODE


a_cut_schedule (Cutter, Class, Port, Tasks, Time)

                   : do for all cutters
                     Analyse DATES

                   */ maintenance and training tasks /*

                   : do for task group one
                     { go to task order = 1,2,3 }
task order (1)     : allocate maintenance to this cutter group
                     go to loop end
task order (2)     : allocate reftraining to this cutter group
                     go to loop end
task order (3)     : allocate training availability to this cutter group
                     go to loop end
loopend            : endo

                   */ patrol tasks /*

                   : do for allowable patrol tasks
patrol order (1)   : allocate patrol
                     go to patrol end
                     . . .
patrol order (4)   : allocate patrol
patrol end         : endo
cutters end        : endo


4.  EXTENDED  MPL  SYNTAX 

MPL is a modelling language for specifying linear programming problems. As in
any other algebraic LP modelling language, the model is specified by
progressively introducing a series of keywords which divide the model components
across sections. The syntax and structure of MPL are set out below in summary
form (Maximal, 1991).

The Definition Part

TITLE          Names the problem
INDEX          The dimension of the problem
DATA           Scalars, datavectors and datafiles
DECISION       Define vector variables
MACRO          Reusable macros for expressions

The Model Part

MODEL          Description of the problem
MAX or MIN     The objective function
SUBJECT TO     The constraints
BOUNDS         Simple upper and lower bounds
FREE           Free variables
INTEGER        Integer variables
BINARY         Binary (0/1) variables
END

TYPICAL  LINEAR  FORM  SYNTAX 

SUM (<index> : <table ref> * <decision variable>)  <relation>  <table ref>


SCHEDULING  MODEL  SPECIFIED  IN  EXTENDED  MPL 


TITLE
    csap_schedule

INDEX
    nmbcutters   = 1..30;
    #DYNAMIC npos(nmbcutters);
    notimeperiod = 1..53;
    notasks      = 1..9;
    patrol       = (D1,D3,D5,D7);
    maintenance  = (drydock, dockside);

DATA
    tasklimits[patrol]                 := DATAFILE(tlimpat.dbs);
    oversatcost[notasks,notimeperiod]  := DATAFILE(ostcost.dbs);
    undersatcost[notasks,notimeperiod] := DATAFILE(ustcost.dbs);
    coverreqmt[notasks,notimeperiod]   := DATAFILE(cover.dbs);

DECISION VARIABLES
    oversat[notasks,notimeperiod];
    undersat[notasks,notimeperiod];
    #DYNAMIC xschedule[npos(nmbcutters)];

MODEL
    MIN deviation = SUM(notasks,notimeperiod: oversat*oversatcost +
                        undersat*undersatcost);

SUBJECT TO
    COVER[notasks,notimeperiod]:
        SUM(nmbcutters,npos: a_cut_schedule*xschedule)
            + undersat - oversat = coverreqmt;

    CHOOSE1[nmbcutters]: SUM(npos: xschedule) = 1;

BINARY
    xschedule;

END


1 Introduction

Given a machine which can process at most one task at a time, and a set $T = \{T_1, \ldots, T_n\}$
of n tasks with associated processing times $p_j$, deadlines $d_j$, flow time
penalties $\alpha_j$ and earliness penalties $\beta_j$, the Single Machine Scheduling
Problem with Earliness and Flow Time Penalties (SMEF) is to determine a processing
sequence for the tasks that minimizes the total cost incurred by the penalties, while
respecting the deadline of each task. The processing cost associated with each
task $T_j$ is equal to its completion time $C_j$ multiplied by the flow time penalty, plus
its earliness $E_j = d_j - C_j$ multiplied by the earliness penalty. Using the three-field
classification introduced in Graham, Lawler, Lenstra and Rinnooy Kan [7], the problem
is denoted as $1|d_j|\sum(\alpha_j C_j + \beta_j E_j)$.

We assume that processing times, deadlines and penalties are positive integers, that
tasks are available at time zero, that setup times, if any, are identical and included in the
processing time and that preemption of tasks is not allowed. A schedule (i.e. a solution
for problem SMEF) is defined through the vector $(C_1, C_2, \ldots, C_n)$ of the completion
times of the tasks: task $T_j$ is processed in time interval $(C_j - p_j, C_j]$.

The flow time penalty has classically been used to model overhead and capital carrying
costs sustained during production, while the earliness penalty takes into account
the cost incurred for storing a finished product until it is shipped.

The problem is strongly NP-hard, since it is a generalization of the single machine
scheduling problem calling for the minimum weighted flow time sequence with no tardy
task ($1|d_j|\sum \alpha_j C_j$), which is known to be NP-hard in the strong sense (see Lenstra,
Rinnooy Kan and Brucker [8]). An exact algorithm for SMEF, based on a dynamic
programming approach, has been developed by Bard, Venkatraman and Feo [2]. Feo,
Venkatraman and Bard [4] have recently presented a heuristic algorithm based on a
Greedy Randomized Adaptive Search Procedure (GRASP). Special cases and related
problems have also been studied by Fry and Leong [6], Bagchi and Ahmadi [1], Faaland
and Schmitt [5], and Sen, Raizadeh and Dileepan [10].

In  the  following  sections  we  develop  lower  bounds  and  an  approximation  algorithm  for 
SMEF  and  show,  through  computational  experiments,  the  effectiveness  of  the  proposed 
approaches.  In  Section  2  we  present  some  simple  lower  bounds  and  a  better  one  based  on 
a  preemptive  relaxation  of  the  problem.  In  Section  3  we  use  the  preemptive  lower  bound 
to  obtain  an  approximation  algorithm.  The  approximation  algorithm  is  experimentally 
analyzed  in  Section  4. 

Unless otherwise specified, we will always assume that the tasks are numbered so
that:

$$d_1 \le d_2 \le \ldots \le d_n. \qquad (1)$$


2  Lower  bounds 

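The objective function of SMEF can be written as:

$$z(SMEF) = \min \sum_{j=1}^{n} \left( \alpha_j C_j + \beta_j (d_j - C_j) \right) = \sum_{j=1}^{n} \beta_j d_j + \min \sum_{j=1}^{n} w_j C_j = \sum_{j=1}^{n} \beta_j d_j + z(SMEF'), \qquad (2)$$

where $w_j = \alpha_j - \beta_j$ is the overall penalty of task $T_j$.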
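The rewriting of the objective can be checked numerically. The short Python sketch below evaluates the cost of a given schedule in both forms; the function names and the three-task data are hypothetical and serve only to illustrate that the constant term (the sum of the earliness penalties times the deadlines) separates from the part weighted by the overall penalties.

```python
def smef_cost(C, d, alpha, beta):
    """Total cost: flow time penalty on C_j plus earliness penalty on d_j - C_j."""
    return sum(a * c + b * (dj - c) for c, dj, a, b in zip(C, d, alpha, beta))


def smef_cost_decomposed(C, d, alpha, beta):
    """Same cost written as sum(beta_j d_j) + sum(w_j C_j), with w_j = alpha_j - beta_j."""
    constant = sum(b * dj for b, dj in zip(beta, d))
    weighted = sum((a - b) * c for a, b, c in zip(alpha, beta, C))
    return constant + weighted


# Hypothetical data: three tasks with completion times C and deadlines d.
C = [2, 5, 6]
d = [4, 6, 9]
alpha = [3, 1, 2]
beta = [1, 4, 2]

print(smef_cost(C, d, alpha, beta))             # 35
print(smef_cost_decomposed(C, d, alpha, beta))  # 35
```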

2.1  A  simple  bound 

We can partition T into $TR = \{T_j \in T : w_j < 0\}$ and $TL = \{T_j \in T : w_j > 0\}$. These
two subsets contain tasks that behave differently in an optimal schedule: the tasks
of set TR should be processed as late as possible, while those of set TL should be
scheduled as soon as possible.

Let $P_R$ (resp. $P_L$) denote the sub-instance of SMEF' in which only the tasks in TR
(resp. TL) are considered, and let $z(P_R)$ (resp. $z(P_L)$) denote the corresponding solution
value. We will consider the relaxation of SMEF' obtained by assuming that a task in
TR can be processed in parallel with a task in TL: the optimal solution to this problem is
clearly provided by the separate solutions to $P_R$ and $P_L$, so $L = \sum_{j=1}^{n} \beta_j d_j + z(P_R) + z(P_L)$
is a lower bound on z(SMEF).

Since any instance of $1|r_j|\sum w_j C_j$, which is known to be strongly NP-hard (see
Lenstra, Rinnooy Kan and Brucker [8]), can be easily transformed into an equivalent
instance of $P_R$, we know that this problem too is strongly NP-hard. Problem $P_L$ is the
already mentioned $1|d_j|\sum w_j C_j$, which is also known to be NP-hard in the strong sense.
Hence the above lower bound L cannot be computed in polynomial time, but we can determine
lower bounds $L(P_R)$ and $L(P_L)$ for the two subproblems, obtaining lower bound

$$\bar{L} = \sum_{j=1}^{n} \beta_j d_j + L(P_R) + L(P_L).$$

A lower bound for problem $P_R$ can be obtained by allowing more than one task
of TR to be processed at a time. The optimal solution is clearly obtained in O(n) time
by scheduling each task as late as possible, i.e. setting $C_j = d_j$ for each $T_j \in TR$, and
its value is:

$$L_0(P_R) = \sum_{T_j \in TR} w_j d_j. \qquad (3)$$

A lower bound for problem $P_L$ can be computed by relaxing the deadline constraints,
obtaining the problem $1||\sum w_j C_j$, which can be exactly solved (see Smith [11])
in $O(n \log n)$ time by scheduling the tasks in order of decreasing value of the ratio $w_j/p_j$:
let $L_0(P_L)$ denote the solution value. Then

$$L_0 = \sum_{j=1}^{n} \beta_j d_j + L_0(P_R) + L_0(P_L) \qquad (4)$$

is a valid lower bound for SMEF.

The time complexity for the computation of $L_0$ is clearly $O(n \log n)$.
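The simple bound can be written out directly. The Python sketch below follows the description in this subsection: tasks with negative overall penalty are placed at their deadlines, tasks with positive overall penalty are sequenced by Smith's ratio rule with deadlines ignored, and the constant term is added. The function name `lower_bound_L0` and the four-task instance are assumptions for illustration.

```python
def lower_bound_L0(p, d, alpha, beta):
    """L0 = sum(beta_j d_j) + L0(P_R) + L0(P_L) for one SMEF instance."""
    n = len(p)
    w = [alpha[j] - beta[j] for j in range(n)]
    constant = sum(beta[j] * d[j] for j in range(n))

    # P_R (w_j < 0): tasks may overlap, so each one completes at its deadline.
    bound_R = sum(w[j] * d[j] for j in range(n) if w[j] < 0)

    # P_L (w_j > 0): deadlines are dropped and tasks are sequenced by
    # Smith's rule, i.e. by non-increasing ratio w_j / p_j.
    left = sorted((j for j in range(n) if w[j] > 0),
                  key=lambda j: w[j] / p[j], reverse=True)
    t, bound_L = 0, 0
    for j in left:
        t += p[j]              # completion time of task j in the relaxed sequence
        bound_L += w[j] * t
    return constant + bound_R + bound_L


# Hypothetical instance with four tasks.
p = [2, 3, 1, 4]
d = [4, 6, 9, 10]
alpha = [3, 1, 2, 5]
beta = [1, 4, 2, 2]

print(lower_bound_L0(p, d, alpha, beta))  # 70
```

For comparison, the feasible schedule with completion times (2, 5, 6, 10) on this instance costs 85, so the bound is valid but not tight here.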

2.2  A  new  lower  bound 

Let us consider the following new problem, called S(SMEF') in the sequel, derived from
SMEF' by allowing each task $T_j$ to be split into k(j) pieces $T_{j_1}, \ldots, T_{j_{k(j)}}$ with
deadlines $d_{j_i} = d_j$ for each i and j, positive processing times $p_{j_1}, \ldots, p_{j_{k(j)}}$ such that
$\sum_{i=1}^{k(j)} p_{j_i} = p_j$ for each j, and weights $w_{j_1}, \ldots, w_{j_{k(j)}}$ having the same sign as $w_j$ and such
that $\sum_{i=1}^{k(j)} w_{j_i} = w_j$ for each j.

Let $C_{j_i}$ be the completion time of piece $T_{j_i}$: the objective function of S(SMEF') is

$$\sum_{j=1}^{n} \sum_{i=1}^{k(j)} w_{j_i} C_{j_i}. \qquad (5)$$

Posner [9] proves (for the case $w_j > 0$, but the proof holds also for unrestricted $w_j$)
that, given a feasible solution to SMEF', of value z, the corresponding solution for
S(SMEF') (obtained by consecutively scheduling $T_{j_1}, \ldots, T_{j_{k(j)}}$ in time interval
$(C_j - p_j, C_j]$) has value $\bar{z}$ such that $z = \bar{z} + CBRK$, where

$$CBRK = \sum_{j=1}^{n} \sum_{i=1}^{k(j)} w_{j_i} \sum_{h=i+1}^{k(j)} p_{j_h}. \qquad (6)$$

Let $\bar{z}^*$ be the optimal solution value of S(SMEF'); then $\bar{z}^* \le \bar{z}$, hence

$$L_1 = \sum_{j=1}^{n} \beta_j d_j + \bar{z}^* + CBRK \qquad (7)$$

is a valid lower bound on the value of z(SMEF).
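The relation between the value of an unsplit schedule and the value of the corresponding split schedule can be verified on a small case. In the Python sketch below the pieces of each task are laid out consecutively so that the last piece completes at the task's completion time; the function names `cbrk` and `split_value`, the list-of-pairs representation of a splitting and the two-task data are hypothetical.

```python
def cbrk(splits):
    """CBRK for a splitting given as splits[j] = [(w_j1, p_j1), (w_j2, p_j2), ...]."""
    total = 0
    for pieces in splits:
        for i, (w_i, _) in enumerate(pieces):
            # weight of piece i times the processing time of the pieces after it
            total += w_i * sum(p_h for _, p_h in pieces[i + 1:])
    return total


def split_value(C, splits):
    """Value of the split schedule when the pieces of task j fill (C_j - p_j, C_j] in order."""
    value = 0
    for c_j, pieces in zip(C, splits):
        t = c_j - sum(p for _, p in pieces)  # start time of task j
        for w_i, p_i in pieces:
            t += p_i                         # completion time of this piece
            value += w_i * t
    return value


# Hypothetical data: task 1 has w = 5, p = 3; task 2 has w = -4, p = 2.
splits = [[(2, 1), (3, 2)], [(-1, 1), (-3, 1)]]
C = [3, 5]
z = 5 * 3 + (-4) * 5                         # value of the unsplit schedule

print(z, split_value(C, splits), cbrk(splits))   # -5 -8 3
print(z == split_value(C, splits) + cbrk(splits))  # True
```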

Given a feasible task splitting, we can partition the set of pieces of S(SMEF') into
$T^+ = \{T_{j_i} : w_{j_i} > 0\}$ and $T^- = \{T_{j_i} : w_{j_i} < 0\}$. Let $n^+$ and $n^-$ be the cardinalities of
these two subsets, where obviously we have $n^+ + n^- = \sum_{j=1}^{n} k(j)$. Let us also rename the
pieces in such a way that

$$T^+ = \{T_1^+, T_2^+, \ldots, T_{n^+}^+\}, \qquad T^- = \{T_1^-, T_2^-, \ldots, T_{n^-}^-\}$$

(with $d_j^+$, $d_j^-$, $p_j^+$, $p_j^-$, $w_j^+$, $w_j^-$ renamed accordingly) and that

$$d_j^- \le d_{j+1}^-. \qquad (8)$$

Definition 1 Given set $T^-$, a block is a set $B_i = \{T_{a_i}^-, T_{a_i+1}^-, \ldots, T_{b_i}^-\}$ of consecutive
pieces (ordered according to (8)), whose total processing time is not greater than the time
interval between the deadline of [...] and the deadline of $T_{b_i}^-$. Let $s_i = d_{b_i} - \sum_{h=a_i}^{b_i} p_h^-$:
the associated block interval is $BI_i = (s_i, d_{b_i}]$.


Let us define

$$\tau = \min\Big\{ t : \sum_{j=1}^{n^+} p_j^+ + \sum_{T_j^- : d_j^- \le t} p_j^- \le s_i \ \text{for each } i : d_{b_i} > t \Big\}, \qquad (9)$$

and note that $\tau$ must be the completion time of a piece belonging to $T^+$. Then we can
divide problem S(SMEF') into two subproblems:

$P_A$: S(SMEF') for the pieces in $T_A = T^+ \cup \{T_j^- : d_j^- \le \tau\}$,

$P_B$: S(SMEF') for the pieces in $T_B = \{T_j^- : d_j^- > \tau\}$,

and observe that $\sum_{T_j \in T_A} p_j = \tau$.

Without loss of generality, let us assume from now on that $p_{j_i} = 1$ for each $j_i$.

Theorem 1 The separate optimal solutions to $P_A$ and $P_B$ do not overlap and produce
the optimal solution to S(SMEF').




Proof: the pieces in $T_A$ must obviously be scheduled before $\tau$. Observe that in any
optimal solution to S(SMEF') the pieces in $T_B$ must be scheduled after $\tau$. Assume
indeed that a unit piece $T^- \in T_B$ is scheduled before $\tau$: by definition of $\tau$, a piece
$T^+ \in T_A$ must be scheduled after $\tau$, so interchanging the two units would improve
the solution. Hence the thesis, since scheduling any piece of $T^+$ after $\tau$ would leave a useless
idle time before $\tau$. □

Problem $P_B$ is equivalent to $1|r_j|\sum v_j C_j$ with tasks allowed to be split as
described above, and $v_j = -w_j$, $r_j = \max_k\{d_k\} - d_j$. This problem is exactly solved by
the algorithm proposed by Belouadah, Posner and Potts [3].

Theorem 2 In the optimal solution to $P_A$ any unit $T^- \in T_A$ is scheduled in the block
interval associated with the block containing $T^-$.

Proof: we prove the theorem by contradiction. Let $B_i$ be the rightmost block whose block
interval does not contain a unit $T^-$ belonging to the block, and observe that such $T^-$ must
be scheduled at a time instant preceding $BI_i$. Since in any optimal solution no idle time
can exist between 0 and $\tau$, $BI_i$ must contain at least one unit of $T^+$ (indeed no unit of
$T^-$ could come from a block on its right, by definition of $B_i$, nor from a block on its left,
since this would violate the deadline): let $T^+$ denote the rightmost such unit, scheduled at
$t \in BI_i$. Find the rightmost unit $\hat{T}^-$, scheduled at a time instant preceding t, which can
be scheduled in t (note that such a unit must exist since, by definition, a block interval can
be filled by units of $T^-$ with no idle time). Interchanging $T^+$ and $\hat{T}^-$, we would improve the
solution, a contradiction. Further observe that, if $\hat{T}^- \ne T^-$, the process can be iterated
until $T^-$ is moved to $BI_i$. □

Corollary 1 Problem $P_A$ decomposes into: (i) the problem $P_A^-$ of optimally scheduling
the pieces belonging to $T^-$ of each block of $P_A$ in the associated block interval; and (ii) the
problem $P_A^+$ of optimally scheduling the pieces belonging to $T^+$ in intervals
$(0, \tau] \setminus \bigcup_i \{BI_i : d_{b_i} \le \tau\}$.

Problem $P_A^-$ can be optimally solved applying the algorithm of Belouadah, Posner
and Potts [3] to the tasks $T_j \in TR$ such that the corresponding pieces are in $T_A$. Problem
$P_A^+$ can be optimally solved applying the Posner algorithm to an instance defined by the
tasks $T_j \in TL$, plus a number of dummy tasks $T_i^-$, one for each block $B_i$ with $d_{b_i} \le \tau$,
having deadlines $d_{b_i}$, processing times $(d_{b_i} - s_i)$ and weights $-\epsilon$ (with $\epsilon > 0$). Observe
that such an algorithm would schedule each dummy task exactly at $s_i$ without splitting it.

Hence, in order to compute lower bound $L_1$, we should derive from the original
problem the three problems $P_A^-$, $P_A^+$ and $P_B$, and separately solve them. However, we
can obtain a single $O(n \log n)$ algorithm that determines and solves these problems at
the same time, by modifying the Posner algorithm. The Posner algorithm for $1|d_j|\sum w_j C_j$ with
task splitting starts with a time instant $t = \sum_j p_j$, and schedules, by decreasing time
instants, tasks or pieces in [0, t]. This algorithm, which works also for negative weights,
can directly solve both $P_A^-$ and $P_A^+$ if applied to $P_A$ with starting time $t = \tau$. To solve $P_B$,
it is sufficient to observe that the algorithm proposed by Belouadah, Posner and Potts
for the problem obtained from $P_B$ by replacing $v_j$ with $-w_j$ and $r_j$ with $\max_k\{d_k\} - d_j$
is equivalent to iteratively applying the Posner algorithm to the tasks of each block $B_i$,
starting from the rightmost one. When a block is completely scheduled, if the sum of
the processing times of the unscheduled tasks is greater than the ending time of the next
block, then this is the first block of $P_A$, and problem $P_B$ is optimally solved.

3  Approximation  algorithm 

In this section we introduce an approximation algorithm for SMEF based on lower
bound $L_1$. The algorithm determines a feasible schedule for problem SMEF, starting
from the optimal solution of problem S(SMEF'). Given the optimal solution
to S(SMEF'), and observing that this can be infeasible for SMEF only because of
the split tasks, we can easily obtain a feasible sequence as follows. We start with
$t = \max_i\{d_{b_i}\}$ and proceed by decreasing completion times until we encounter a piece
$T_{j_a}$, obtained by splitting a task (or a piece) $T_j$ into $T_{j_a}$ and $T_{j_b}$, with processing times
$p_{j_a}$ and $p_{j_b}$ respectively, scheduled at time instants $t = t_a$ and $t_b$ with $t_a > t_b + p_{j_a}$. We
can eliminate this infeasibility in three possible ways:

a) scheduling $T_{j_b}$ at time instant $t_a - p_{j_a}$ and shifting left, by $p_{j_b}$ time units, all the
tasks currently scheduled between $T_{j_b}$ and $T_{j_a}$;

b) scheduling $T_{j_a}$ at time instant $t_b - p_{j_b}$ and shifting left the necessary tasks preceding
$T_{j_b}$, until an idle time interval of length at least $p_{j_a}$ is created, if the corresponding
schedule is feasible;

c) scheduling $T_{j_a}$ at time instant $t_b + p_{j_a}$ and shifting right the tasks between $T_{j_b}$ and
$T_{j_a}$, by $p_{j_a}$ time units, if the corresponding schedule is feasible.

Whenever  a  piece  is  encountered,  the  algorithm  evaluates  all  these  alternatives  and 
selects  the  one  producing  the  minimum  objective  function  increase. 

The final approximate solution to SMEF is then obtained by optimally inserting idle
times. This can be done through the O(n) procedure described in Bard, Venkatraman
and Feo [2].

4 Computational Results

We have coded in C the lower bound $L_1$ described in Section 2, the GRASP
heuristic described in Feo, Venkatraman and Bard [4] and the approximation algorithm
of Section 3.

We executed computational experiments on a PC 486 with a 33 MHz clock,
considering problems like those described in Feo, Venkatraman and Bard [4].

For each task $T_j$ the associated value of $p_j$ is uniformly random in range [1, 10].
For each value of n (n = 10, 15, 20, 25, 30), three classes of random test problems are
defined:

I) $\alpha_j < \beta_j$ for approximately 50% of the tasks;

II) $\alpha_j < \beta_j$ for approximately 66% of the tasks;

III) $\alpha_j < \beta_j$ for approximately 33% of the tasks;

where both $\alpha_j, \beta_j \in [1, 10]$.

The deadline of each task $T_j$ is uniformly random in range $[\beta^- \sum_{j=1}^{n} p_j, \; \beta^+ \sum_{j=1}^{n} p_j]$,
with the following $(\beta^-, \beta^+)$ pairs: (0.75, 1.75), (0.25, 1.75), (0.75, 1.75), (0.50, 2.50),
(0.25, 1.25), (0, 1.25), and (0, 1).

For each class, for each value of n and for each pair $(\beta^-, \beta^+)$ ten feasible problems
were generated, giving a total of 350 instances.
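A generator along the lines described above might look as follows. This is a sketch under several assumptions that the text does not settle: the function names, rounding the ends of the deadline range down to integers, drawing the penalty pair by rejection until it has the wanted order, and testing feasibility with the earliest-deadline sequence (which meets all deadlines on a single machine whenever any sequence does). It is not the generator used in the experiments.

```python
import random


def edd_feasible(p, d):
    """True if sequencing the tasks by earliest deadline meets every deadline."""
    t = 0
    for j in sorted(range(len(p)), key=lambda j: d[j]):
        t += p[j]
        if t > d[j]:
            return False
    return True


def random_instance(n, frac_alpha_less, beta_minus, beta_plus, rng, max_tries=10000):
    """Draw one feasible instance (p, d, alpha, beta) with integer data.

    frac_alpha_less is the approximate share of tasks with alpha_j < beta_j.
    """
    for _ in range(max_tries):
        p = [rng.randint(1, 10) for _ in range(n)]
        total = sum(p)
        # Assumption: range ends are rounded down; deadlines are kept positive.
        lo, hi = int(beta_minus * total), int(beta_plus * total)
        d = [rng.randint(max(lo, 1), hi) for _ in range(n)]
        alpha, beta = [], []
        for _ in range(n):
            want_less = rng.random() < frac_alpha_less
            a, b = rng.randint(1, 10), rng.randint(1, 10)
            while (a < b) != want_less:  # redraw until the pair has the wanted order
                a, b = rng.randint(1, 10), rng.randint(1, 10)
            alpha.append(a)
            beta.append(b)
        if edd_feasible(p, d):
            return p, d, alpha, beta
    raise RuntimeError("no feasible instance found")


rng = random.Random(1993)
p, d, alpha, beta = random_instance(10, 0.5, 0.75, 1.75, rng)
print(p, d, alpha, beta)
```

Rejecting infeasible draws keeps only instances for which at least one schedule meets all deadlines, matching the requirement that the generated problems be feasible.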

The computational experiments show that the algorithm of Section 3 produces
solutions better than those produced by GRASP, with running times which are up to
two orders of magnitude smaller.


Combining Genetic and Local Search for Solving the Job Shop Scheduling Problem

Ulrich  Dorndorf*  Erwin  Pesch* 


Abstract 

This paper describes a genetic algorithm that uses a local search based improvement
operator for solving the job shop scheduling problem. The genetic algorithm serves as
a meta-heuristic that guides a procedure for building starting solutions which are then
improved by local search. Initial computational results are encouraging: the algorithm
has solved the famous 10 x 10 problem instance formulated by Fisher and Thompson
in 1963 which has defied solution for over 20 years.

1  Introduction 

The minimum makespan problem of job shop scheduling (JSP) is a classical combinatorial optimization
problem that has received considerable attention in the operations research literature; in
recent years, exact algorithms [6, 3, 5] and tailored approximation methods [2] for the JSP have
progressed significantly. It is well known that the JSP is NP-hard [23] and belongs to the
most intractable problems considered. The problem is thus a good test for evaluating the power
of generally applicable approximation techniques [1]. The algorithm described here combines two
such techniques, genetic and local search. The idea of using problem specific information in the form of
local search within the framework of a genetic algorithm has been suggested before in a number of
publications, see for instance [18, 26, 15, 27, 1, 21, 33, 9]. This paper focusses on the combination
of relatively simple building blocks rather than on fine-tuning the individual parts; for instance,
more intricate local search neighbourhood structures than the one employed here are known [22, 8].

The  remainder  of  this  paper  is  organised  as  follows.  After  a  short  introduction  to  the  JSP 
in  the  next  section,  section  3  presents  a  simple  variable  depth  search  improvement  heuristic;  we 
assume that the reader is familiar with the concepts of local search. Section 4 describes the
genetic  framework  in  which  this  procedure  operates.  We  conclude  with  a  description  of  initial 
computational  results. 

2  The  Job  Shop  Scheduling  Problem 

A  job  shop  consists  of  a  set  M  of  m  different  machines  that  perform  operations  on  a  set  J  of 
jobs.  Each  job  has  a  specified  processing  order  through  the  machines,  i.e.  it  is  an  ordered  list 
of operations from the set V = {1, ..., n}. An operation is characterized by the machine it requires
and  by  its  processing  time.  Operations  cannot  be  interrupted  (non-preemption),  each  machine  can 
handle  only  one  job  at  a  time,  and  each  job  can  only  be  performed  on  one  machine  at  a  time. 
The problem is to find operation sequences on the machines which minimize the makespan, the
maximum of the completion times of all operations.

An illuminating description of the problem is the disjunctive graph model introduced by Roy
and Sussman [31]. In the edge-weighted graph, there is a vertex for every operation i ∈ V and two
dummy vertices 0 and n + 1 representing the start and end of a schedule. For every two consecutive
operations of the same job there is a directed arc from the arc set A between the corresponding
vertices; the start and end vertices 0 and n + 1 are considered to be the first and last operation,
respectively, of every job.

For each machine k ∈ M the edge set E_k contains all pairs {i, j} of operations to be performed
on k. Because these operations cannot overlap, an orientation of the disjunctive edges in E_k must
be chosen; an operation i either has to be performed before j (choose the orientation (i, j)) or after
j (choose (j, i)). A solution to the JSP (a schedule) can be seen as an orientation of all disjunctive
edges in E = {E_1, ..., E_m} such that the resulting graph G(V, A, E) is acyclic, i.e. there are no
precedence conflicts between operations.

Each arc or oriented disjunctive edge (i, j) in G is labeled with a weight corresponding to the
processing time of the operation (vertex) i from which the arc/edge starts. The earliest starting
time t_i of an operation i in a schedule is equal to the length of the maximum weight or longest
path in G from the start node 0 to vertex i; the makespan of the schedule is equal to the length of
the maximum weight or critical path from start node 0 to end node n + 1.
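
The longest path computation can be illustrated with a short Python sketch. It is only a
minimal illustration, not the authors' implementation: the function name `makespan`, the
list-based data layout and the small two-job instance are assumptions made for this example.
Vertices are numbered 0 to n + 1, and the arc list is expected to contain the job arcs
together with the already oriented disjunctive edges.

```python
from collections import defaultdict, deque

def makespan(proc_time, arcs):
    """Longest path from vertex 0 to the last vertex of an oriented graph.

    proc_time[i] is the processing time of vertex i (0 for both dummies).
    Returns (makespan, earliest start times); raises ValueError on a cycle.
    """
    n_vertices = len(proc_time)
    succ = defaultdict(list)
    indeg = [0] * n_vertices
    for i, j in arcs:
        succ[i].append(j)
        indeg[j] += 1
    start = [0] * n_vertices
    queue = deque(v for v in range(n_vertices) if indeg[v] == 0)
    visited = 0
    while queue:                       # process vertices in topological order
        i = queue.popleft()
        visited += 1
        for j in succ[i]:
            # arc (i, j) carries the processing time of its tail vertex i
            start[j] = max(start[j], start[i] + proc_time[i])
            indeg[j] -= 1
            if indeg[j] == 0:
                queue.append(j)
    if visited != n_vertices:
        raise ValueError("orientation is cyclic: not a feasible schedule")
    return start[-1], start

# Hypothetical instance: job 1 = operations 1, 2; job 2 = operations 3, 4.
# Operations 1 and 4 share one machine, operations 2 and 3 the other.
proc_time = [0, 3, 2, 2, 4, 0]
job_arcs = [(0, 1), (1, 2), (2, 5), (0, 3), (3, 4), (4, 5)]
length, start = makespan(proc_time, job_arcs + [(1, 4), (3, 2)])   # length == 7
```

Orienting both disjunctive edges the other way, (4, 1) and (2, 3), closes the cycle
1, 2, 3, 4, 1, so the function raises an error instead of returning a makespan.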

3  A  Local  Search  Procedure  for  the  JSP 

Most  neighbourhood  structures  that  have  been  employed  in  local  search  algorithms  for  the  JSP  can 
be  considered  to  be  based  on  an  idea  used  in  one  of  the  first  exact  solution  methods  for  the  problem 
due to Balas [4]. His implicit enumeration algorithm makes use of the fact that in every schedule
with  a  makespan  shorter  than  the  one  of  the  current  schedule,  at  least  one  of  the  disjunctive  edges 
on  the  critical  path  of  the  current  solution  graph  must  be  reversed.  Reversing  an  edge  on  the 
critical  path  of  a  directed  graph  always  yields  an  acyclic  graph  [4],  in  other  words  a  new  feasible 
solution  without  precedence  conflicts  between  operations.  These  observations  suggest  the  following 
neighbourhood  structure: 

The neighbourhood N(x) of a solution x characterized by the solution graph G_x is the
set of solutions y with a solution graph G_y that can be obtained from G_x by reversing
the orientation of a disjunctive edge (i, j) on the critical path of G_x, i.e. by replacing
(i, j) with (j, i).

Reversing the edge (i, j) means changing the order in which i and j are processed on a machine.
This neighbourhood is connective [4, 22]: it is possible to transform an arbitrary solution into
every other solution, including the optimal one, by going through a sequence of moves in the
neighbourhood, in other words by iteratively replacing a current solution x with one of its
neighbours y ∈ N(x).
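
As a rough illustration of this neighbourhood, the following Python sketch lists the oriented
disjunctive edges on one critical path and builds one neighbour per such edge. The function
names, the arc-list representation and the tie-breaking (a single critical path is traced
when several exist) are assumptions made for the example, not details given in the paper.

```python
from collections import defaultdict, deque

def critical_disjunctive_arcs(proc_time, conj_arcs, disj_arcs):
    """Oriented disjunctive arcs on one critical path of an acyclic solution graph."""
    n = len(proc_time)                        # vertices 0 .. n-1, the sink is n-1
    succ, indeg = defaultdict(list), [0] * n
    for i, j in list(conj_arcs) + list(disj_arcs):
        succ[i].append(j)
        indeg[j] += 1
    start, pred = [0] * n, [None] * n
    queue = deque(v for v in range(n) if indeg[v] == 0)
    while queue:                              # longest paths in topological order
        i = queue.popleft()
        for j in succ[i]:
            if pred[j] is None or start[i] + proc_time[i] > start[j]:
                start[j], pred[j] = start[i] + proc_time[i], i
            indeg[j] -= 1
            if indeg[j] == 0:
                queue.append(j)
    disj, path, v = set(disj_arcs), [], n - 1
    while pred[v] is not None:                # walk back from the sink to the source
        if (pred[v], v) in disj:
            path.append((pred[v], v))
        v = pred[v]
    return path[::-1]

def neighbours(proc_time, conj_arcs, disj_arcs):
    """One neighbouring orientation per critical disjunctive arc, with that arc reversed."""
    result = []
    for i, j in critical_disjunctive_arcs(proc_time, conj_arcs, disj_arcs):
        result.append([(j, i) if arc == (i, j) else arc for arc in disj_arcs])
    return result
```

For a hypothetical instance with processing times [0, 3, 2, 2, 4, 0], job arcs
(0,1), (1,2), (2,5), (0,3), (3,4), (4,5) and the orientation (1,4), (3,2), the only
critical disjunctive arc is (1,4), so the single neighbour uses (4,1) and (3,2).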

In order to use the neighbourhood in a local search algorithm, a gain must be associated with
every move. The gain from reversing an edge (i, j) can be estimated based on considerations
about the minimal length of the critical path of the resulting graph (finding the exact gain of a
move would generally involve a longest path calculation). The gain of a move can be negative,
thus leading to a deterioration of the objective function. For details, we refer to [4].

The  simple  neighbourhood  structure  described  above  has  been  extended  by  Matsuo  et  al.  [25] 
and Dell’Amico and Trubian [8], see also [32, 1, 22].

In  the  remainder  of  this  section,  we  present  a  local  search  procedure  that  uses  the  neighbourhood 
defined  above.  The  algorithm  is  based  on  a  technique  described  by  Kernighan  and  Lin  [20,  24], 
which was later named ‘variable depth search’ by Papadimitriou and Steiglitz [29]. The
method  can  be  seen  as  a  special  case  of  a  more  general  approach  introduced  by  Glover  [14].  The 
basic  idea  is  similar  to  the  one  used  in  tabu  search  [12,  13],  the  main  difference  being  that  the  list 
of  forbidden  (tabu)  moves  grows  dynamically  during  a  variable  depth  search  iteration  and  is  reset 
at  the  beginning  of  the  next  iteration. 




Figure 1: A variable depth search algorithm

start with an initial solution x*;
repeat
    T := ∅;  {T is the tabu list}
    d := 0;  {d is the current search depth}
    do
        d := d + 1;
        find the best move, i.e. the disjunctive edge (i*, j*) for which
            g(i*, j*) = max {g(i, j) | (i, j) ∈ E − T};  {note that g(i*, j*) can be negative}
        make this move, i.e. reverse the edge (i*, j*), thus obtaining the solution x_d at
            search depth d;
        T := T + {(i*, j*)};
    while |T| < |E|;
    let d* denote the search depth at which the best solution x_d* with
        f(x_d*) = min {f(x_d) | 0 ≤ d ≤ |E|} has been found;
    if d* > 0 then
        x* := x_d*;
until d* = 0;


The algorithm is outlined in figure 1; f(x) is the objective function. Beginning with a starting
solution x*, the procedure looks ahead for a certain number of moves and then sets the starting
solution x* for the next iteration to the best solution found in the look-ahead phase at depth d*.
These steps are repeated as long as an improvement is possible. When the maximal look-ahead
depth, where the length |T| of the tabu list equals the cardinality |E| of the edge set, is reached,
every disjunctive edge has been reversed once. The step leading from the starting solution of one
iteration to the starting solution of the next iteration consists of a varying number d* of
moves in the neighbourhood, hence the name variable depth search. At the inner level of the ‘do
while’ loop, the algorithm can escape local optima by allowing moves with negative gain. Cycling
is avoided via the dynamically growing tabu list T. At the outer level of the ‘repeat until’ loop,
the procedure stops as soon as an iteration without improvement occurs.
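
The control flow of such a search can be sketched in Python independently of the JSP. The
sketch below is an assumption-laden illustration rather than the authors' code: the names
`variable_depth_search`, `moves` and `apply_move` are hypothetical, the best move is found by
evaluating the objective exactly instead of estimating the gain, and the look-ahead ends when
no non-tabu move is left.

```python
def variable_depth_search(x0, f, moves, apply_move):
    """Minimise f by variable depth search.

    moves(x) returns the hashable candidate moves of solution x and
    apply_move(x, m) returns the neighbour reached by move m.
    """
    best = x0
    while True:                                   # outer 'repeat until' loop
        tabu = set()                              # reset at every iteration
        x = best
        trail = [x]                               # solutions x_0, x_1, ..., x_d
        while True:                               # inner look-ahead loop
            candidates = [m for m in moves(x) if m not in tabu]
            if not candidates:
                break
            # best move = smallest resulting objective; it may still be worse than f(x)
            move = min(candidates, key=lambda m: f(apply_move(x, m)))
            x = apply_move(x, move)
            tabu.add(move)                        # the tabu list grows with the depth
            trail.append(x)
        # min() returns the first minimum, so ties are resolved in favour of depth 0
        d_star = min(range(len(trail)), key=lambda d: f(trail[d]))
        if d_star == 0:
            return best                           # no improvement in this iteration
        best = trail[d_star]

# Toy problem (hypothetical): the objective depends on the number of ones in a bit
# tuple and has a local optimum at the all-zero tuple that single flips cannot leave.
table = {0: 2, 1: 3, 2: 1, 3: 4, 4: 0}
f = lambda x: table[sum(x)]
moves = lambda x: range(len(x))
flip = lambda x, k: x[:k] + (1 - x[k],) + x[k + 1:]
result = variable_depth_search((0, 0, 0, 0), f, moves, flip)   # (1, 1, 1, 1)
```

In the toy run the first iteration passes through objective values 2, 3, 1, 4, 0 and
accepts the solution at depth 4; the second iteration finds nothing better and stops.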

As  an  extension  of  the  algorithm,  the  outer  level  (‘repeat  until’)  could  be  embedded  in  yet 
another  control  loop  (not  shown  here)  and  use  a  search  strategy  similar  to  the  inner  level,  thus 
leading  to  a  multi-level  search  algorithm  [14]. 

4  A  Genetic  Algorithm  for  the  JSP 

Genetic algorithms (GAs) are general search strategies and optimization methods motivated by
the theory of evolution; they date back to the early work of Holland [19] and Rechenberg [30], see
also [16]. Simply speaking, a GA aims at producing near-optimal solutions by letting a set
(population) of random solutions (individuals) undergo a sequence of unary and binary
transformations governed by a selection scheme biased towards high-quality solutions. The
solutions manipulated by a GA are usually represented as binary strings, e.g. a binary number or
a vector of such numbers. The transformations are applied to the population by three simple
operators: selection, mutation, and crossover. The effect of the operators is that implicitly good
properties are identified and combined into new individuals of a new population which hopefully
has the property that the best solution and the average value of the solutions are better than in
previous populations. This process is repeated until some stopping criteria are met. Figure 2
shows the outline of a GA.

Figure 2: A genetic algorithm

t := 0;  {t is the generation counter}
initialize P(t);  {P(t) is the population in generation t}
evaluate P(t);
while the stopping criteria are not satisfied do
begin
    t := t + 1;
    select P(t) from P(t − 1);
    recombine P(t);
    evaluate P(t);
end;

In  the  selection  step,  a  copy  of  an  old  individual  is  produced  with  a  probability  proportional  to 
its fitness value, i.e. better strings are likely to get more copies. Selection can be realized in a number
of ways; one could adopt the scenario of Goldberg [16] or use deterministic ranking. It also
matters whether the newly recombined offspring compete with the parent solutions or simply
replace  them.  Recombination  consists  of  crossover  and  mutation.  In  order  to  apply  the  crossover 
operator,  the  population  is  randomly  partitioned  into  pairs.  Next,  for  each  pair,  crossover  is  applied 
with  a  certain  probability  by  randomly  choosing  a  position  in  the  string  and  exchanging  the  tails 
(defined  as  the  substring  starting  at  the  chosen  position)  of  the  two  strings.  The  mutation  operator 
which  makes  random  changes  to  single  elements  of  a  string  only  plays  a  secondary  role;  its  main 
purpose  is  to  maintain  diversity  in  the  population. 
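
The two recombination operators are easy to state in code. The following Python sketch
assumes that individuals are lists of 0/1 values of equal length (at least two entries); the
function names and the explicit random generator argument are choices made for this example.

```python
import random

def one_point_crossover(a, b, rng):
    """Exchange the tails of two equal-length strings at a random position."""
    assert len(a) == len(b) and len(a) >= 2
    cut = rng.randrange(1, len(a))       # first index that belongs to the tail
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(s, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in s]

rng = random.Random(0)
child1, child2 = one_point_crossover([0] * 8, [1] * 8, rng)
child1 = mutate(child1, 0.01, rng)
```

With an all-zero and an all-one parent the first child consists of a block of zeros followed
by a block of ones, which makes the exchanged tail directly visible.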

Compared to standard heuristics, for instance for the traveling salesman problem, “genetic
algorithms are not well suited for fine-tuning structures which are very close to optimal
solutions” [18]. However, it is often easy to extend a GA to incorporate (local search)
improvement operators in the evaluation step. The resulting algorithm has been called genetic
local search heuristic; in case of the traveling salesman problem we refer to the papers of
Ulder et al. [33] and Kolen and Pesch [21].

In  order  to  apply  a  GA  to  an  optimization  problem,  solutions  must  be  encoded  in  a  format  that 
can  be  manipulated  by  the  GA.  The  traditional  GA  based  on  a  binary  string  representation  of  a 
solution  is  often  unsuitable  for  combinatorial  optimization  problems  because  it  is  very  difficult  to 
represent  a  solution  in  such  a  way  that  the  substrings  have  a  meaningful  interpretation.  Choosing 
a  more  natural  representation,  however,  involves  more  intricate  recombination  operators  to  ensure 
that  the  offspring  is  feasible;  for  an  example  see  the  crossover  operators  developed  for  the  JSP  by 
Aarts  et  al.  [1]  and  Nakano  and  Yamada  [28]. 

The  underlying  idea  of  the  GA  described  in  this  paper  is  to  use  the  genetic  information  to  guide 
a  heuristic  which  finds  a  starting  solution  for  the  JSP.  The  GA  thus  serves  as  a  meta-heuristic  which 
produces  a  sequence  of  decision  rules  that  direct  another  algorithm.  The  output  of  this  algorithm 
can  then  be  improved  by  a  local  search  procedure,  and  the  improved  solution  is  finally  inserted 
into the GA population again. Using the strings of a GA to guide a scheduling heuristic was first
suggested by Davis [7]. Applications of a GA to the JSP have been described in [28, 9, 34].

Before we take a closer look at the GA itself, let us briefly introduce the algorithm of Giffler and
Thompson [11], which can be considered as a common basis of all priority rule based heuristics
for the JSP. The procedure can generate all active and hence also all optimal schedules. The
algorithm, which is outlined in figure 3, iteratively assigns operations from the set Q of unscheduled
operations to machines. In the description in figure 3, r_i and C_i denote the earliest possible start
and completion time, respectively, of operation i.

Figure 3: The algorithm of Giffler and Thompson

Q := {1, ..., n};
repeat
    among all unscheduled operations in Q let j* be the one with smallest completion
        time, i.e. C_j* = min {C_j | j ∈ Q}; let m* denote the machine j* has to be
        processed on;
    randomly choose an operation i from the conflict set
        C = {j ∈ Q | j has to be processed on machine m* and r_j < C_j*};
    Q := Q − {i}; modify r_j and C_j for all operations j ∈ Q;
until Q = ∅;

In the first step of each iteration, the machine on which the next operation has to be scheduled
is chosen in such a way that only active schedules will be generated (see [11]). In the second step,
conflicts, i.e. operations competing for the same machine, are resolved randomly. In priority rule
based heuristics, an operation from the conflict set C is selected according to a priority rule rather
than randomly, for instance “choose the operation with the smallest processing time”.
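
A compact Python sketch of this construction is given below. It is an illustration under
assumptions, not the authors' C implementation: the data layout (`jobs[j]` as a list of
(machine, processing time) pairs), the callback `choose` and the restriction to strictly
positive processing times are choices made for the example. Only the next unscheduled
operation of each job is examined, which gives the same minimum and the same conflict set as
looking at all unscheduled operations, because a later operation of a job cannot start before
its predecessor is completed.

```python
import random

def giffler_thompson(jobs, choose):
    """Build an active schedule; returns {(job, position): start time}.

    choose(conflict) receives the conflict set as a list of (job, position)
    pairs and returns one of them. Processing times are assumed positive.
    """
    next_op = [0] * len(jobs)        # position of each job's next unscheduled operation
    job_ready = [0] * len(jobs)      # completion time of each job's last scheduled operation
    mach_ready = {}                  # time at which each machine becomes free
    start = {}
    remaining = sum(len(ops) for ops in jobs)
    while remaining:
        cand = []                    # (completion C, start r, machine, job)
        for j, ops in enumerate(jobs):
            if next_op[j] < len(ops):
                m, p = ops[next_op[j]]
                r = max(job_ready[j], mach_ready.get(m, 0))
                cand.append((r + p, r, m, j))
        c_star, _, m_star, _ = min(cand)          # smallest completion time
        conflict = [(j, next_op[j]) for c, r, m, j in cand
                    if m == m_star and r < c_star]
        j, k = choose(conflict)
        m, p = jobs[j][k]
        start[(j, k)] = max(job_ready[j], mach_ready.get(m, 0))
        job_ready[j] = mach_ready[m] = start[(j, k)] + p
        next_op[j] += 1
        remaining -= 1
    return start

# Hypothetical 2 x 2 instance; machines are numbered 0 and 1.
jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
random_schedule = giffler_thompson(jobs, random.Random(1).choice)
spt = lambda conflict: min(conflict, key=lambda op: jobs[op[0]][op[1]][1])
spt_schedule = giffler_thompson(jobs, spt)        # shortest processing time rule
```

Passing a different `choose` function is all that is needed to switch between random
conflict resolution and a priority rule.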

Using  the  Giffler/Thompson  algorithm  within  the  framework  of  a  GA  is  straightforward.  The 
random  choice  of  an  operation  from  the  conflict  set  can  be  replaced  with  a  choice  based  on  a 
decision  rule,  where  either  the  rule  itself  or  the  information  used  within  the  rule  is  supplied  by  a 
GA.  For  example,  as  described  in  [9],  a  GA  can  manipulate  strings  of  priority  rules  that  are  then 
evaluated  by  using  them  in  the  iterations  of  the  Giffler/Thompson  algorithm. 

Here,  we  let  the  GA  manipulate  the  information  to  be  used  in  a  decision  rule.  An  individual 
member  of  the  population  corresponds  to  a  job  shop  schedule;  it  is  a  string  of  n  entries,  where 
n is the number of operations in the problem instance. An entry i represents the starting time
t_i of operation i in the schedule. Because the vector of starting times can easily be stored in the
traditional form of a binary string, the standard crossover and mutation operators can be applied.
A newly recombined string is evaluated by using it as input for guiding the Giffler/Thompson
algorithm. Instead of randomly picking an operation from the conflict set C, the choice is based
on the string information that is used in the following ‘earliest starting time’ rule:

Choose the operation i* in the conflict set C for which t_i* = min {t_i | i ∈ C}.

Yamada  and  Nakano  [34]  have  independently  described  a  crossover  operator  that  is  based  on  the 
same  genetic  string  representation.  During  crossover,  the  schedules  of  the  individuals  to  be  crossed 
are  used  to  guide  the  Giffler/Thompson  algorithm;  the  random  choice  of  an  operation  from  the 
conflict  set  is  replaced  by  the  following  decision  sequence: 

1.  Apply  mutation  with  a  small  probability  by  randomly  choosing  an  operation  from  the  conflict 
set. 

2. If there was no mutation then randomly (with equal probability) select one of the two parents
to be crossed and choose the operation i* in the conflict set C for which t_i* = min {t_i | i ∈ C},
where t_i denotes the starting time of operation i in the selected parent’s schedule.
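
Both ways of resolving a conflict can be written as small selection functions. The Python
sketch below is illustrative only; the function names, the use of dictionaries for the
starting times and the explicit random generator are assumptions made for the example.

```python
import random

def earliest_start_rule(t):
    """Pick the operation i* of the conflict set with the smallest string entry t[i]."""
    return lambda conflict: min(conflict, key=lambda i: t[i])

def gt_crossover_rule(t_parent_a, t_parent_b, mutation_prob, rng):
    """Decision sequence of the Giffler/Thompson crossover for one conflict set."""
    def choose(conflict):
        if rng.random() < mutation_prob:                  # 1. mutation: random operation
            return rng.choice(conflict)
        t = rng.choice((t_parent_a, t_parent_b))          # 2. pick a parent, equal probability
        return min(conflict, key=lambda i: t[i])          #    and follow its starting times
    return choose

# Hypothetical starting times for three operations.
rule = earliest_start_rule({"a": 5, "b": 2, "c": 9})
picked = rule(["a", "b", "c"])                            # "b"
```

Either function can serve as the choice step of a schedule builder that repeatedly resolves
conflict sets.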

5  Computational  Results 

The GA with local variable depth search has been implemented in C and tested on a Sun
SPARCstation IPX. We have used Grefenstette’s general purpose genetic search system GENESIS [17]
for the GA part of our algorithm. Limited initial tests have been performed using the three famous
problem instances introduced by Fisher and Thompson in 1963 [10], which have since then served
as a test for almost every algorithm for the JSP. We have tested both the standard crossover,
where the resulting individual string serves for guiding the Giffler/Thompson algorithm, and the
Giffler/Thompson crossover of Yamada and Nakano; both versions of the algorithm have given
similar results. The algorithm has been run five times on each problem instance, and all instances
have been solved to optimality within a CPU time of ten minutes for a single run. While the 6 x 6
problem and the 5 x 20 problem are relatively easy, it is quite remarkable that the algorithm has
always solved the notoriously difficult 10 x 10 instance.

In  our  tests,  we  have  used  the  following  GA  parameters:  a  crossover  rate  of  0.8,  no  mutation,  a 
generation  gap  of  1  and  a  window  size  of  5  (see  [17]),  an  elitist  strategy,  where  the  best  individual 
of  a  generation  always  survives  reproduction,  and  an  improvement  probability  of  0.2,  meaning  that 
on  average  20%  of  the  newly  recombined  individuals  are  improved  by  the  local  search  procedure; 
this  parameter  has  been  selected  after  a  few  experiments  and  it  seems  possible  that  it  can  be 
improved. It is likely that the look-ahead depth |E|, the cardinality of the edge set, used in the
variable depth search could be optimized; in our experiments, the optimal depth d* has usually
been reached after reversing only a small fraction of the total number of disjunctive edges. Since
the control parameters have not been fine-tuned, we suspect that the efficiency of our algorithm
could still be increased by the ‘tender loving care factor’.

Because our initial tests have been limited to a small number of experiments with only three
problem instances, the results are not yet very conclusive, so great care needs to be taken when
comparing them to the results obtained by applying other generally applicable approximation
techniques to the JSP as described in [25, 32, 22, 1, 8, 28, 34]. We would just like to remark
that the results and running times indicate that our algorithm is at least competitive. When
compared  to  modern  exact  methods  and  tailored  approximation  methods  for  the  JSP  [2,  6,  3,  5], 
the  running  times  of  the  algorithm  seem  relatively  high.  However,  these  methods  are  substantially 
more  involved  than  the  algorithm  described  here,  and  extending  them  to  modified  versions  of  the 
problem  is  not  easy. 

6  Conclusions 

We  have  presented  a  genetic  algorithm  that  guides  the  Giffler/Thompson  heuristic  for  building 
active schedules which are then improved by a variable depth search procedure. The algorithm,
which is composed of quite simple building blocks, has solved the notoriously difficult 10 x 10
problem instance of Fisher and Thompson to optimality.

The  work  described  in  this  paper  will  be  extended  in  several  directions.  Firstly,  more  conclusive 
computational  results  will  be  produced  by  applying  the  algorithm  to  a  larger  suite  of  standard  test 
problems  and  by  comparing  its  results  to  those  obtained  by  applying  the  individual  components 
separately.  Secondly,  more  sophisticated  search  neighbourhood  structures  as  described  in  [8]  might 
be used, and thirdly, the variable depth search technique could be replaced with tabu search as
described in [12, 13, 8] for comparing the two methods.


1. Introduction.

Nonlinear Programming (NLP) algorithms can be classified
into algorithms that generate a sequence of feasible points and
algorithms where the intermediate points in general are
infeasible. The first class, called feasible path methods, can often
be made very reliable because they work with feasible points.
However, they require a method for generating an initial feasible
point.

This paper describes a new algorithm for finding an initial
feasible point in connection with the Generalized Reduced
Gradient (GRG) algorithm (Abadie and Carpentier, 1969), and in
particular in the large sparse GRG algorithm CONOPT (Drud, 1985
and 1992). The algorithm is based on ideas from Crash procedures
in Linear Programming (LP) with adjustments that take into
account the special features of nonlinear models.

The paper is organized as follows: Section 2 defines our
problem  and  assumptions.  Section  3  summarizes  traditional  methods 
used  for  finding  an  initial  feasible  solution  in  GRG  algorithms. 
Section  4  describes  the  proposed  crash  procedure.  Section  5 
contains  some  limited  computational  experience,  and  section  6 
concludes  the  paper. 


2.  Assumptions. 

We consider nonlinear programs of the following general
form:

                    min   f(x)                              (1)
    subject to            g(x) = b                          (2)
    and                   l ≤ x ≤ u                         (3)

where x is the n-vector of decision variables, g is the m-vector
of constraint functions, f is the objective function, b is the
m-vector of right hand sides, and l and u are n-vectors of lower
and upper bounds. Some of the bound values may be minus or plus
infinity. We assume that f and g are defined and have continuous
derivatives for all values of x satisfying the bounds (3).

We assume that there are hundreds or thousands of equations
and variables, and that the Jacobian is sparse. In addition, we
assume that the individual nonlinear functions and their
derivatives can be computed independently. This is a reasonable
assumption when the model is communicated to the solution algorithm by
a modeling system such as the General Algebraic Modeling System,
GAMS (Brooke et al., 1988), currently the most widely used input
generator for CONOPT.


3. The GRG Algorithm and its Phase-1 Procedure.

When  the  GRG  algorithm  is  described  the  problem  of  finding 
an  initial  feasible  solution  is  usually  ignored. 

Traditionally, the problem of finding an initial feasible
solution has been attacked similarly to the way it is done in Phase
1 in LP: Artificial variables with suitable bounds are added to
the infeasible equations to yield a relaxed but feasible model.
The sum of the artificials is then minimized, subject to the
equations of the relaxed model. The solution to this phase-1
model is either a point in which all artificial variables are
zero, i.e. a feasible solution to the original problem, or a
strictly positive local minimum of the sum of infeasibilities in
which case the model is considered (locally) infeasible.
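
One possible way to write such a phase-1 model is shown below. The notation is an assumption
made for illustration and is not taken from the paper: x^0 is the initial point, I is the set
of equations that are infeasible at x^0, a_i is the artificial variable of equation i, and
sigma_i is the sign of the initial residual b_i - g_i(x^0). With this choice the point
(x^0, a) with a_i = |b_i - g_i(x^0)| is feasible for the relaxed model.

```latex
\begin{align*}
\min_{x,\,a}\quad & \sum_{i \in I} a_i \\
\text{subject to}\quad
  & g_i(x) + \sigma_i a_i = b_i, \quad a_i \ge 0, && i \in I, \\
  & g_i(x) = b_i, && i \notin I, \\
  & l \le x \le u, \\
\text{where}\quad & \sigma_i = \operatorname{sign}\bigl(b_i - g_i(x^0)\bigr), && i \in I .
\end{align*}
```

An optimal value of zero corresponds to a feasible point of (2) and (3); a strictly positive
local minimum corresponds to the (locally) infeasible case described above.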

The computational cost of this phase-1 procedure will depend
on the number of artificial variables in the initial point, i.e.
on the number of infeasible equations, and on the size of the
infeasibilities. The procedure may be relatively slow on models
with many small infeasibilities because the removal of each
artificial, independent of its initial size, requires at least
one iteration. We have therefore implemented an initial Phase-1
heuristic in CONOPT to get around this problem. The heuristic
is summarized in Fig. 1.


1. Select a set of basic variables favoring variables
   away from bounds.

2. Perform a Newton step using these basic variables.

   a  Use steplength < 1 if a basic variable otherwise
      would exceed a bound.

   b  Change the basis if the critical basic variable
      is at bound.

   c  If the iterations do not converge due to
      nonlinearities then

         change the basis, or

         remove "bad" equations from the Newton process.

3. When the "good" equations are feasible, add artificials
   to the "bad" equations and minimize the sum
   of artificials using the standard GRG procedure.


Figure 1: The Phase-1 heuristic in CONOPT

When many variables have good initial values the heuristic
will be able to select a good basis with many variables away from
their bounds, and a feasible solution can be found very quickly,
essentially with the speed of a Newton procedure.

There are several disadvantages. If there are few
variables away from bounds in the initial point then it is
difficult to find a good basis and the procedure may use many
iterations before it gives up and switches to the
standard minimization of artificials. Even worse, the procedure
does not work well with the way many model builders define
initial values: Important or critical variables are assigned
reasonable values while unimportant variables are left
undefined. Variables with "good" values, i.e. those away from
bounds, will be selected as basic variables and changed
during the initial iterations, while variables with "bad" values,
usually at bounds, are kept unchanged.


Most LP systems have a so-called crash procedure
that tries to find an initial basis. The purpose is,
with few resources, to find a point with few artificial variables
and to use it as initial point for the Phase-1 LP.

A good description of crash procedures can be found in
(Gould and Reid, 1989). One of the procedures advocated in this
paper is based on very simple principles:

1. Order the equations and variables into almost triangular form.

2. Solve the equations one by one in this order, keeping the
variables from previous equations fixed.

3. If an equation is infeasible, solve a larger subproblem
involving some of the previous equations. If still infeasible,
introduce an artificial variable.
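
For a single nonlinear equation, the step of solving for one variable while all other
variables stay fixed can be sketched with a bounded Newton iteration. The Python fragment
below is a minimal illustration under assumptions: the function name, the projection of each
iterate onto the bounds and the tolerance are hypothetical choices, not details of CONOPT.

```python
def solve_one_equation(g, dg, b, x, lower, upper, tol=1e-10, max_iter=50):
    """Newton iterations on one variable for g(x) = b; returns (x, converged).

    g and dg are the equation and its derivative as functions of the free
    variable only; all other variables are assumed to be fixed inside them.
    """
    for _ in range(max_iter):
        residual = g(x) - b
        if abs(residual) <= tol:
            return x, True
        slope = dg(x)
        if slope == 0.0:
            return x, False                       # Newton step undefined
        x = min(max(x - residual / slope, lower), upper)   # keep x within its bounds
    return x, abs(g(x) - b) <= tol

# x**2 = 2 is solvable within [0, 10] but not within [0, 1].
root, ok = solve_one_equation(lambda x: x * x, lambda x: 2.0 * x, 2.0, 1.0, 0.0, 10.0)
stuck, failed = solve_one_equation(lambda x: x * x, lambda x: 2.0 * x, 2.0, 0.5, 0.0, 1.0)
```

When the second return value is false, the equation is a candidate for the larger subproblem
or the artificial variable of step 3.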

The procedure can easily be generalized to nonlinear equations.
Step 1, ordering into almost triangular form, is independent
of linearity. Step 2 involves the solution of one equation at a
time. It can easily be generalized to nonlinear equations by
substituting an explicit solution with an iterative procedure
such as Newton's method. Step 3 involves the solution of sets
of several equations. We have not yet implemented this step, but
a method based on Newton's method can also be used here.

The ordering in Gould and Reid's paper is based on the P3
procedure in (Hellerman and Rarick, 1971). It can also be
implemented in a slightly different way as follows:

1. Compute row counts as the number of nonzero Jacobian elements
in each row. Columns with fixed variables are excluded.

2. If there are no rows left, stop. Otherwise, find RCmin, the
minimal row count.

3. RCmin = 0: Select the row(s) with row count 0, remove the
row(s) from further consideration, and go to 2.

4. RCmin = 1: Select a row with row count 1 and its corresponding
column, remove the row and column from the Jacobian, update
row counts, and go to 2.

5. RCmin > 1: Select a row with row count RCmin and its
corresponding columns, remove the row and columns from the Jacobian,
update row counts, and go to 2.

The  order  in  which  the  rows  and  columns  are  selected  defines 
their  order  in  the  almost  triangular  form. 
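
The ordering loop can be sketched in a few lines of Python. The sketch makes several
assumptions that are not in the paper: the sparsity pattern is given as a dictionary from row
index to the set of column indices with nonzero entries, ties between rows with the same
count are broken by row index, and the three cases RCmin = 0, RCmin = 1 and RCmin > 1 are
handled by the same code because they differ only in how many columns the selected row
brings along.

```python
def row_count_ordering(rows, fixed=()):
    """Order rows (and the columns they select) by repeatedly taking a row of minimal count.

    rows[r] is the set of column indices with a nonzero Jacobian element in row r.
    Returns a list of (row, selected columns) in selection order.
    """
    fixed = set(fixed)
    # step 1: row counts ignore columns of fixed variables
    active = {r: set(cols) - fixed for r, cols in rows.items()}
    order = []
    while active:                                            # step 2: stop when no rows are left
        r = min(active, key=lambda k: (len(active[k]), k))   # row with count RCmin
        cols = active.pop(r)                                 # steps 3 to 5: select and remove it
        order.append((r, sorted(cols)))
        for other in active.values():                        # update the remaining row counts
            other.difference_update(cols)
    return order

# Hypothetical pattern: three equations in four variables, every row count equal to 2.
pattern = {0: {0, 1}, 1: {1, 2}, 2: {2, 3}}
order = row_count_ordering(pattern)     # [(0, [0, 1]), (1, [2]), (2, [3])]
```

In the example the first selected row brings two columns with it, so one of the two
variables has to be kept fixed; the remaining rows then have row count 1 and can be solved
one after the other. The sketch does not decide which of the two variables is fixed.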

The ordering procedure is not well defined in the selection
in steps 4 and 5. Builders of LP models will usually not provide
initial values and the selection is therefore only based on the
sparsity structure of the LP matrix.

The situation is quite different for nonlinear models. As
a result of the sequential solution procedure some variables will
be kept fixed, while the remaining variables will be computed as
functions of these fixed variables and of variables computed
earlier in the sequential process. We should therefore try to
order the variables and equations such that fixed variables have
"good" initial values, while the equations preferably are solved
with respect to variables without a "good" initial value.

The freedom in selecting variables to keep fixed appears only
when RCmin > 1 in step 5 above. In the following we will discuss
how we determine whether an initial value is "good", and how this
influences the ordering. We will start with a small example.


Figure 2: The Jacobian Structure of a Small Model.


Fig. 2 shows the structure of the Jacobian of a model with
four variables, three equations, and RCmin = 2. If any one of the
four variables is fixed, the three equations can be solved
recursively for the remaining three variables.

Let x(X_j) denote the solution for X_j fixed and let x^0 be a
vector of initial values. Depending on which of the four
variables we fix we can compute four initial points: x(X_1),
x(X_2), x(X_3), and x(X_4). Note that some of these points may be
infeasible because of bounds on the variables. Also note that if
X_j is fixed then the initial values of the other variables are
ignored, except as initial points in the solution process.

Our problem is to select one of these four points without
actually evaluating them all, i.e. from x^0 only. A number of
characteristics of the different points may help us in the
selection:

A  If X_4 has a "good" initial value then x(X_4) will be a "good"
   solution.

B  User supplied initial values are likely to be "better" than
   default initial values.

C  If an equation is feasible, the variables appearing in it are
   more likely to have "good" values.

D  If X_i and X_j are the only variables in an equation and this
   equation is feasible then x(X_i) = x(X_j).

We will in the following separate the variables into two
groups: variables with default initial values and variables with
defined initial values. The definition will depend on how the
model is communicated to CONOPT. When the model comes from GAMS,
zero projected on the bounds is considered a default initial
value and other values are defined. Variables with defined
initial values will in general be considered "better" than variables
with default initial values.

We will also separate the equations into feasible and
infeasible constraints. An initial value that appears in a feasible
constraint will in general be considered "better" than an initial
value that appears in an infeasible constraint. If an infeasibility
can be repaired by adjusting a variable with a default initial
value, the default variable could be an uninitialized intermediate
variable, and the infeasibility is not considered to be bad.

Based  on  these  considerations  we  select  a  "best"  row  with 
row  count  =  RCmin  whenever  RCmin  >  1.  The  selection  is  done  by 
giving  priorities  from  1  to  6  to  the  candidate  rows  and  selecting 
a  row  with  the  smallest  priority.  The  priorities  are  defined  as 
follows: 


1.  Feasible with at least 2 defined variables. Since the equation
    is feasible the defined values seem to be "good". Select the
    variable with the largest Jacobian element as basic.

2.  Infeasible with 1 default variable that can be changed to
    satisfy the equation. The defined values seem to be reasonable.
    Select the default variable as basic and solve the equation
    w.r.t. this variable.

3.  Feasible with 1 defined variable. Since the equation is feasible
    the defined value seems to be "good". Select the defined
    variable as basic.

4.  Infeasible with at least 2 default variables, each of which can
    be changed to satisfy the equation. The equation can be made
    feasible, but the solution depends on which variable is changed.
    We select the variable with the largest pivot to be changed, in
    order to minimize the absolute change.

5.  Feasible with only default values. Since all values are default
    the feasibility seems to be accidental. Select the largest
    pivot as basic.

6.  Infeasible. No single variable can be changed to make the
    equation feasible. Select the variable that will reduce the
    infeasibility the most.
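
The six priorities can be written as a small classification function. The sketch below is a reading of the list, not CONOPT code; the function names and the three inputs (a feasibility flag, the number of defined variables in the row, and the number of default variables that could each repair the row on their own) are assumptions made for the illustration.

```python
def row_priority(feasible, n_defined, n_repairing_defaults):
    """Return the priority (1 = best, 6 = worst) of a candidate row.

    feasible: whether the equation holds at the current point.
    n_defined: number of variables in the row with defined initial values.
    n_repairing_defaults: number of default-valued variables that could each,
        on their own, be changed to satisfy the equation.
    """
    if feasible:
        if n_defined >= 2:
            return 1
        if n_defined == 1:
            return 3
        return 5  # only default values: feasibility looks accidental
    if n_repairing_defaults == 1:
        return 2
    if n_repairing_defaults >= 2:
        return 4
    return 6  # no single variable can repair the row


def best_row(candidates):
    """Pick the name of the candidate with the smallest priority.

    candidates: list of (name, feasible, n_defined, n_repairing_defaults).
    """
    return min(candidates, key=lambda c: row_priority(*c[1:]))[0]
```

For example, among a feasible row with only default values, an infeasible row that one default variable can repair, and an infeasible row that no single variable can repair, `best_row` picks the second one.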


Whenever  a  row  is  selected  we  try  to  make  it  feasible 
immediately.  The  updated  values  of  the  variables  are  then  used 
to  evaluate  feasibility  during  the  selection  of  the  next  row. 
This  is  in  contrast  to  the  LP  environment  where  the  logical 
ordering  is  done  before  the  solution  process  is  started. 

When RCmin = 1 we must select a particular row with row
count one. If each row has its own column then the solution is
independent of the order in which the rows are selected. However,
if a potential pivot column intersects more than one candidate
row the solution will depend on which row is selected. In this
case we try to minimize the sum of infeasibilities in the
remaining rows, and the row selection is similar to the CHUZR
procedure in Rarick's Phase-1 procedure for LP.

Many equations will be feasible and we will have a basic
variable in most equations after the procedure outlined above has
been used. However, there may still be some equations without a
basic variable: equations selected when RCmin = 0, and equations
that cannot be made feasible.

The basis can be completed with artificial variables and the
traditional phase-1 procedure can be applied to minimize the sum
of the artificials. Alternatively, we may select the missing
basic variables from the variables away from bounds and use the
heuristic in Fig. 1.


5.  Computational Experience.

A set of tables with computational comparisons is available
from the author. We will here summarize some numbers from a
medium sized 8-period refinery model. The model has 1793
variables of which 1464 have defined initial values. The model's 1593
equations are divided into 95 pre-triangular equations that can
be solved recursively before the optimization, 2 post-triangular
equations that can be collapsed into the objective function,
and 1496 remaining equations. Of these remaining equations 1199 (80%)
are made feasible with a basic variable by the crash procedure while 297
equations are not assigned a basic variable initially.

The  sum  of  infeasibilities  is  initially  4844.6.  After 
solving  the  95  pre-triangular  equations  and  removing  the  2  post- 
triangular  equations  there  are  165  infeasible  equations  and  the 
sum  of  infeasibilities  is  4772.6.  The  crash  procedure  produces 
a  point  with  only  30  infeasible  equations  (a  reduction  of  82%) 
and  a  sum  of  infeasibilities  of  33.14  (a  reduction  of  99.3%). 

The original feasibility heuristic mentioned in Fig. 1,
followed by the ordinary phase-1 procedure, required 1165
iterations and 582 seconds to find a feasible solution, and the
overall optimization required 2596 iterations and 1698 seconds.
The crash procedure followed by the heuristic required 246
iterations and 125 seconds to find a feasible solution (78% time saving)
and 1669 iterations and 1139 seconds to find the optimal solution
(32% time saving). The crash procedure followed by a straight
minimization of artificials required 613 iterations and 219 seconds to
find a feasible solution (62% time saving) and 2269 iterations and
1591 seconds to find the optimal solution (6% time saving).
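
The quoted savings refer to solution time and can be rechecked from the reported seconds. The short script below is only an arithmetic check; the helper name `saving` is made up for the illustration. The reported percentages agree with the computed values truncated to whole percent.

```python
def saving(old, new):
    """Relative saving in percent when a cost drops from old to new."""
    return 100.0 * (old - new) / old


# Seconds reported in the text: time to feasibility and total time.
baseline_feasible, baseline_optimal = 582, 1698
runs = {
    "crash + heuristic": (125, 1139),
    "crash + artificials": (219, 1591),
}
for name, (t_feasible, t_optimal) in runs.items():
    print(f"{name}: {saving(baseline_feasible, t_feasible):.1f}% to feasibility, "
          f"{saving(baseline_optimal, t_optimal):.1f}% overall")
```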

The savings on other models vary considerably, but they are
positive on almost all models. There is also considerable
variation between the two options for finishing the basis: the old
feasibility heuristic or minimization of artificials.

One interesting result is that some difficult models that
CONOPT previously declared infeasible now prove to be feasible. The
reason seems to be that the crash procedure moves many variables
with default initial values away from their bounds or from zero,
resulting in a better behaved point that is further away from
any singularities.


6.  Conclusions. 

Although  the  computational  testing  is  still  ongoing  we  can 
already  conclude  that  the  new  crash  procedure  is  very  promising. 
Given  a  few  good  initial  values  we  will  on  most  models  be  able 
to  reduce  the  time  to  find  an  initial  feasible  solution.  The 
initial  feasible  solution  will  often  be  better  which  reduces  the 
following  optimization  time.  And  we  seem  to  be  able  to  solve  more 
difficult  models  that  could  not  be  solved  before. 

The more intelligent use of initial values may mean that it
is no longer necessary to supply initial values for many
intermediate variables. This simplifies model construction and may
encourage model builders to concentrate on essential initial
values. User specifications of the quality of initial values
could enhance the procedure.

More work is still needed, in particular on:

-  improving the selection of variables to fix, e.g. based on
   information from several equations and on the influence on
   the objective,

-  completing the basis after the crash procedure, and

-  limited backtracking when equations are infeasible.


1  Introduction

Structural optimization has become an increasingly important tool in the design process, due to
the continuously increasing demands on technical systems and their components. Because
structural optimization techniques are increasingly applied to real industrial problems, the share
of so-called large scale problems increases accordingly. These problems are characterized by a
high demand on computer resources (storage capacity, calculation time) within the solution
process. Various decomposition techniques have been developed in order to solve such large
scale problems efficiently [Wis71, Him73]. Parallel processing means a computational decomposition of a task
onto different processors or computer nodes, and therefore it is also a very general decomposition
approach. Here, the solution of shape optimization problems of complex shell structures on a
parallel computer system will be presented. As an application we have chosen the shape optimization
of an automotive wheel with respect to several load cases.

2  Treatment  of  shape  optimization  problems 

The  mathematical  formulation  of  shape  optimization  problems  can  be  written  as  follows: 

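$$F^*[R^*(\xi^\alpha)] = \text{Min}\,\{\, F[R(\xi^\alpha)] \mid R(\xi^\alpha) \in \mathcal{G} \,\} \qquad (1)$$

$$\mathcal{G} = \{\, R(\xi^\alpha) \in \mathbb{R}^3 \mid H[R(\xi^\alpha)] = 0,\ G[R(\xi^\alpha)] \ge 0,\ R^l \le R \le R^u \,\}$$

with

F : vector of objective functionals,

H, G : equality and inequality constraint functionals,

R : shape functions,

$\xi^\alpha$ : Gaussian parameters,

$R^l$, $R^u$ : lower and upper bounds for the shape functions,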

$\mathcal{G}$ : set of feasible shape functions.

The shape optimization problems formulated in (1) can be solved by means of indirect and direct
methods. Indirect procedures derive necessary conditions for the optimal shape using variational
principles and subsequently solve the resulting differential equations, generally by approximation
methods. When a direct solution method is applied, the shape optimization problem (1) is
transformed into a multicriteria parameter optimization problem using approach functions with
free parameters. Parametric spline functions known from the field of CAD [Boe84], in particular,
proved to be highly efficient for various applications [Bra84]. The obtained multicriteria
formulation is subsequently transformed by means of preference strategies into a scalar parameter
optimization problem which can finally be solved by any Mathematical Programming algorithm
(MP-algorithm).

Structural optimization problems can be solved by an optimization procedure following the
Three-Columns-Concept [Esc91]. The three columns are the optimization algorithms, the
structural analysis modules and the optimization model. All modules can be chosen according to the
problem formulation. The direct optimization strategy is realized in the design model (Fig. 1).
The approach functions are chosen problem-dependently from an extensive library.

Figure 1: Optimization loop for shape optimization problems


3  Decomposition 

The large-scale problems of type (1) occurring in the optimization of real-life technical systems
require a high storage capacity and extensive calculation times. The efficient solution of such
problems by means of the available resources (storage capacity/computation time) calls for the
application of decomposition techniques.

Fig. 2 shows the potential decomposition approaches for problems in structural optimization.
The decomposition methods usually employ several of the depicted decomposition approaches.
While model decomposition mainly aims at reducing the required computer storage,
computational decomposition aims at reducing the computation time required for the solution.

Figure 2: Classification of decomposition approaches in the field of structural optimization

The term structural decomposition means partitioning the state vector u and the corresponding
state equations into subvectors,

$$u \Rightarrow [u_1, u_2, \ldots, u_{ns}].$$

This partitioning can be done according to several substructures [Bat76, Guy65] or according to
different disciplines. If the structural subproblems are coupled with each other, which is true in
most cases, a coordination procedure has to be used.

Decomposition of the optimization model requires the partitioning of the design variables x
and/or the constraints h, g into subvectors,

$$x, h, g \Rightarrow [x_1, x_2, \ldots, x_{ns}], \ldots$$

After solving the subproblems independently, the solutions of the subsystems must be
coordinated. The coordination and the concurrent subsystem optimizations form an iterative process.
Methods using this approach are, for example, the Dantzig-Wolfe decomposition [Dan60] for
large linear optimization problems, multilevel methods for hierarchically structured systems (e.g.
[Wis71, Las70]) or for non-hierarchical systems (e.g. [Blo90, Wu92]), and methods based upon
substructuring (e.g. [Kir72, Bre89]).

If suitable parallel computer equipment is available, computational decomposition, which
means nothing else than a parallel or distributed execution of independent calculation tasks, can
be applied. Within the process of optimization, the sensitivity analysis is a subtask suitable for
parallelization, since the computations of the partial derivatives with regard to the design variables
are uncoupled processes and can therefore be parallelized. In many cases, the parallelization is
carried out in combination with a model decomposition (e.g. in combination with substructuring
[Top91, E1S91]). The computations at the subsystem level are then independent of each other and
can be parallelized. In order to quantitatively evaluate the gain achieved by parallelization, the
so-called speed-up S is introduced:

$$S = \frac{T_{seq}}{T_n} \qquad (2)$$

where $T_{seq}$ denotes the required calculation time on one processor and $T_n$ the calculation time on
n processors.

For the parallelization a transputer system consisting of 20 transputers (T800) is available (Fig.
3). The transputers are arranged in an array and their local storage capacities range between 1 MB
and 8 MB. The definition of the processes, the communication between them, and the allocation of
the processors are carried out by means of the declaration language CDL (Component Declaration
Language) under the operating system Helios [NN90]. The topology of the processes can be
defined independently from the hardware topology.

Figure 3: Parallel computing environment

In the present work a parallel substructure technique and the parallel sensitivity analysis are
employed for the solution of the shape optimization problems. A decomposition of the optimization
model is not carried out.

a)  Optimization  using  parallel  substructuring: 

Substructuring is commonly seen as the classical (static) decomposition technique for the structural
analysis of complex components. Reduced subsystem matrices are calculated separately for each
substructure, and these matrices are coupled at the main system level. After determining the state
variables at the boundaries (coupling nodes) of the substructures, the local (internal) ones can
be computed, again separately for each subsystem. Fig. 4a shows the flowchart of the parallel
structural analysis realized here, based upon substructuring, where the computations of the
main system and of each subsystem are carried out on a processor of their own. Since the main system
processor is idle during the subsystem computations, one subsystem is treated on the
main processor. The described procedure is not limited to a particular analysis
program at the subsystem level, but this analysis program must be able to create reduced stiffness
matrices and to consider prescribed displacements.
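
One common way to obtain a reduced stiffness matrix is static condensation, i.e. eliminating the internal degrees of freedom of a substructure. The sketch below assumes this standard technique; the text does not say which reduction the analysis program uses. The function names `solve` and `condense`, the dense list-of-lists storage and the spring example are all hypothetical.

```python
def solve(A, B):
    """Solve A X = B (B given as rows) by Gauss-Jordan with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[r])) + list(map(float, B[r])) for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]


def condense(K, f, internal, boundary):
    """Eliminate the internal DOFs of one substructure.

    Returns the reduced stiffness matrix and load vector on the boundary DOFs:
    K_red = K_bb - K_bi K_ii^-1 K_ib,  f_red = f_b - K_bi K_ii^-1 f_i.
    """
    K_ii = [[K[r][c] for c in internal] for r in internal]
    rhs = [[K[r][c] for c in boundary] + [f[r]] for r in internal]
    X = solve(K_ii, rhs)  # columns: K_ii^-1 K_ib, then K_ii^-1 f_i
    nb = len(boundary)
    K_red = [[K[r][c] - sum(K[r][i] * X[j][k] for j, i in enumerate(internal))
              for k, c in enumerate(boundary)] for r in boundary]
    f_red = [f[r] - sum(K[r][i] * X[j][nb] for j, i in enumerate(internal))
             for r in boundary]
    return K_red, f_red


# Hypothetical substructure: two unit springs in series, nodes 0-1-2.
K = [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]
K_red, f_red = condense(K, [0, 0, 1], internal=[1], boundary=[0, 2])
```

In the spring example the internal node 1 is eliminated and the two boundary nodes are left connected by a single spring of stiffness 0.5, as expected for two unit springs in series. Each substructure can be condensed independently, which is what makes the step suitable for one processor per subsystem.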

The implementation of the above procedure on the transputer system follows the master-slave
concept. In this concept a process called master controls and coordinates a set of subordinate
slave processes. The program representing the slave contains all modules required for the various
calculation tasks at the subsystem level (Fig. 4a). Additionally, it possesses a local database
which stores the necessary control data, even for several structural models. This guarantees a
minimal data transfer during optimization because only the updated geometry and the resulting state
variables have to be transferred. The master process contains the complete optimization loop
including the routines for subsystem calculations, because one subsystem is also analyzed by the
master process. The necessary system calls for the purpose of communication are carried out by
a small set of hardware-independent modules only, which reduces the effort when this concept is
implemented on another computer system.

b)  Parallel  sensitivity  analysis: 

The sensitivity analysis is a very time-consuming subtask within the optimization process. It
requires the calculation of partial derivatives of the objectives and constraints with respect to the
design variables ($\partial f/\partial x_i$, $\partial g/\partial x_i$, $\partial h/\partial x_i$). Here, we approximate them by finite differences (first
order differences).

Figure 4: Parallel substructuring (a) and parallel sensitivity analysis (b)

The approximation of the first derivative of an arbitrary function F(x) with
respect to $x_i$ using the forward difference is:

$$\frac{\partial F(x)}{\partial x_i} \approx \frac{F(x_0 + \Delta_i) - F(x_0)}{|\Delta_i|} \qquad (3)$$

with

$$x_0 + \Delta_i = (x_{0_1}, \ldots, x_{0_i} + \Delta x_i, \ldots, x_{0_n}).$$
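
The forward-difference gradient and its parallel evaluation can be sketched as follows. This is an illustration only: a Python thread pool stands in for the slave processes, and the function name, the step size `dx` and the `workers` argument are assumptions that do not come from the text.

```python
from concurrent.futures import ThreadPoolExecutor


def forward_difference_gradient(F, x0, dx=1e-6, workers=4):
    """Approximate the gradient of F at x0 by forward differences.

    The base point and the n perturbed points are independent of each
    other, so all n + 1 evaluations are submitted to the pool at once.
    """
    points = [list(x0)]
    for i in range(len(x0)):
        p = list(x0)
        p[i] += dx  # perturb only component i
        points.append(p)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        values = list(pool.map(F, points))
    f0 = values[0]
    return [(fi - f0) / dx for fi in values[1:]]


# Hypothetical test function with known gradient (2 * x0, 3).
grad = forward_difference_gradient(lambda x: x[0] ** 2 + 3 * x[1], [1.0, 2.0])
```

In the setting of the text, each call of `F` would correspond to one pass of the optimization loop, which is why distributing the calls pays off.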


As shown in (3), one needs as many functional evaluations as defined design variables n. One
functional evaluation means one pass of the optimization loop (Fig. 1). Since the functional
evaluations are independent of each other, the sensitivity analysis can be parallelized. In contrast
to the decomposition method described in a), this method only reduces the required computer
time but not the required computer memory. Fig. 4b depicts one iteration with parallel sensitivity
analysis. The optimal load balance of the processors is obtained if the condition

$$(n + 1)/(m + 1) \in \mathbb{Z} \qquad (4)$$

is fulfilled.
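
Condition (4) can be checked with a few lines of code. The sketch assumes that n is the number of design variables and m the number of associate processors, so that n + 1 evaluations (the base point plus n perturbed points) are shared among m + 1 processors; the text does not define m explicitly, and the value n = 12 is chosen only for the illustration.

```python
import math


def rounds(n, m):
    """Passes needed when n + 1 evaluations are spread over m + 1 processors."""
    return math.ceil((n + 1) / (m + 1))


def balanced(n, m):
    """Condition (4): (n + 1) / (m + 1) is an integer."""
    return (n + 1) % (m + 1) == 0


n = 12  # assumed number of design variables
balanced_m = [m for m in range(n + 1) if balanced(n, m)]
```

When the condition is violated, some processors are idle during the last pass, because `rounds(n, m)` passes are needed although the last one is not fully occupied.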

The implementation on the transputer system is also done according to the master-slave
concept. The master process includes all routines or modules which are necessary for the entire
optimization. Thus, the master process is executable even without associate processors
for sensitivity analysis. In contrast to the master, the slave process consists only of those routines which
are necessary for one pass of the optimization loop (for instance, no optimization algorithm is
included), and it has a local database.


4  Application:  Optimal  layout  of  an  automotive  wheel 

The automotive wheel to be optimized will be considered as a branched shell of revolution
(unsymmetrical details such as vent holes are disregarded). Fig. 5 shows the initial design of this wheel and
one of the considered load cases (rolling bench test). In addition to the load case rolling bench test,
the load case rotating bending test will be considered in the structural model. The latter load case
is relevant for the design of the disk and for the connection of rim and disk.

Figure 5: Initial design of the automotive wheel including a description of the load case rolling
bench test

The degrees of freedom for finding the optimal design are the thicknesses of rim and disk (each
constant), the meridional shape of the disk and the meridional shape of the centre part of the rim
(Fig. 5). For the description of the meridional shapes of the disk and the rim the approach functions
"B-Spline (k = 3)" and "Coupled circular arc/straight lines" are used. The thicknesses and 10 control
points of the shape functions are chosen as design variables. The weight is defined as the objective
function, and stress and deformation constraints as well as shape constraints are considered. Fig. 6
shows the optimal design of the automotive wheel for the given optimization model. The weight of
the optimal design is 7.02 kg, which means a weight reduction of more than 30% in comparison to the
feasible initial design (10.17 kg, obtained by pure sizing). Then, the shape optimization problem of
the wheel is solved by means of the decomposition methods described in Section 3.

Method 1: The parallel sensitivity analysis is applied. For that purpose, 0 to 12 associate
processors will be used successively. Thus, we cover the whole range from sequential up to fully parallel
sensitivity analysis.

Method 2: Besides the parallelization of the sensitivity analysis, the structural model will be
decomposed and analyzed by means of parallel substructuring. For this purpose, the wheel is cut
at the branch and partitioned into three substructures. Using this method the needed computer
memory is reduced in addition to the reduction of the computer time. The assignment of
processors to the substructure processes (slaves) is fixed while we use variable numbers of processors for
the sensitivity analysis (0 to 4 associate processors). Thus, for this method we use from 3 to 15
processors.

Figure 6: Optimal design W = 7.02 kg


Figure 7: Speed-up and efficiencies for method 1 and method 2

As shown in Fig. 7a, we save considerable computational time with both methods. For
method 2, the saving in computational time is smaller than for method 1 because of the greater portion
of sequential computations during the substructuring. For the evaluation of these two methods,
however, one has to take into account that method 2 saves both computation time and computer memory.
The efficiency (efficiency = speed-up / number of used processors) is a measure of the utilization
of the processors. Concerning the efficiency, method 1 is also better than method 2. The
non-monotonic course of the efficiencies in Fig. 7b is caused by the violation of condition (4) (except
for 0 or 12 associate processors).
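
The non-monotonic efficiency can be reproduced with an idealized model. The sketch below is not based on the measured timings: it assumes that only the n + 1 functional evaluations cost time, that each costs the same, and that n = 12 (two thicknesses plus 10 control points); communication and sequential parts are ignored, and the function names are hypothetical.

```python
import math


def ideal_speed_up(n, m):
    """Speed-up if only the n + 1 equally expensive evaluations cost time."""
    passes = math.ceil((n + 1) / (m + 1))
    return (n + 1) / passes


def efficiency(speed_up, processors):
    """Efficiency = speed-up / number of used processors."""
    return speed_up / processors


n = 12  # assumed number of design variables
table = [(m, efficiency(ideal_speed_up(n, m), m + 1)) for m in range(n + 1)]
```

Under these assumptions the efficiency equals 1 only for m = 0 and m = 12 associate processors and moves up and down in between; for instance it is higher for m = 6 than for m = 5, because 13 evaluations need three passes on 6 processors but only two on 7.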
In this talk we deal with a subproblem arising in the design of a main-frame
computer. This problem can be stated as follows:

Let a set N of items and a set M of modules be given. Each item $i \in N$
has a weight $f_i$. Similarly, with every module $k \in M$ a capacity $F_k$ is
associated which represents the area of the module. Moreover, there is given
a list of nets $Z = \{T_1, \ldots, T_z\}$. Each net $T_t$ is a subset of the set of items
which has to be connected via a wire. Finally, every module $k \in M$ has a
so-called cut capacity $S_k$, which restricts the number of nets that may leave this
module. Roughly speaking, the problem we consider consists in finding an
assignment of the items to the modules such that certain technical restrictions
are satisfied and a very complicated objective function is optimized.

The most important constraints are the following:

The sum of the weights of the items that are assigned to one module
must not exceed its capacity.

For every module $k \in M$ the following requirement must be satisfied:
the total number of nets that contain an item assigned to k and some
other item assigned to some module $l \in M \setminus \{k\}$ must not exceed
the cut capacity $S_k$ of the module.
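
The two constraints can be checked for a given assignment as sketched below. The data layout (dictionaries keyed by item and module names, nets as sets of items) and the function name `is_feasible` are assumptions made for the illustration; the further technical restrictions mentioned in the text are not modelled.

```python
def is_feasible(assign, weight, capacity, cut_capacity, nets):
    """Check the area and cut capacity constraints for an assignment.

    assign: dict item -> module; weight: dict item -> weight;
    capacity, cut_capacity: dict module -> limit; nets: list of sets of items.
    """
    load = {k: 0 for k in capacity}
    for item, k in assign.items():
        load[k] += weight[item]
    if any(load[k] > capacity[k] for k in capacity):
        return False
    for k in capacity:
        # A net leaves module k if it has items both inside and outside k.
        leaving = sum(
            1 for net in nets
            if any(assign[i] == k for i in net) and any(assign[i] != k for i in net)
        )
        if leaving > cut_capacity[k]:
            return False
    return True


# Hypothetical instance with three items, two modules and two nets.
weight = {"a": 2, "b": 2, "c": 3}
capacity = {"A": 4, "B": 3}
nets = [{"a", "b"}, {"b", "c"}]
ok = is_feasible({"a": "A", "b": "A", "c": "B"}, weight, capacity, {"A": 1, "B": 1}, nets)
```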

Let $a : N \to M$ denote some assignment of the items to modules such that
the constraints are satisfied. The objective value of this assignment a is of
the form

$$\sum_{k \in M} K_k \cdot ic_k(a) + A \cdot ec(a).$$

Let us first focus on the first term of the objective function, i.e., the internal
cost. $K_k$ is a constant which corresponds to the fabrication cost for module k.
Roughly speaking, the internal cost of a module depends on the number
of wires that must be routed within this particular module. In fact, the
internal cost is a staircase function. This is due to the fact that every
module consists of several layers on which the routing of the wires takes place.
Depending on the number of wires that must be routed within some module,
a certain number of layers is required. A jump of the staircase function occurs
whenever additional layers for some module become necessary because the number
of wires exceeds a certain threshold. The external cost ec(a) represents the
number of nets running between different modules. The parameter A is an
estimate of the cost for one wire.
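
The structure of this objective can be illustrated as follows. The sketch makes several assumptions that are not in the text: the number of wires routed inside each module is taken as given input, every layer is assumed to hold the same number of wires (`wires_per_layer`), and the internal cost is taken to be the number of layers required. Only the external cost follows the text directly.

```python
import math


def internal_cost(wires, wires_per_layer):
    """Staircase function: number of layers needed for the given wires."""
    return math.ceil(wires / wires_per_layer)


def external_cost(assign, nets):
    """Number of nets whose items are spread over more than one module."""
    return sum(1 for net in nets if len({assign[i] for i in net}) > 1)


def objective(assign, nets, wires, K, wires_per_layer, A):
    """Weighted internal cost over all modules plus A times the external cost."""
    internal = sum(K[k] * internal_cost(wires[k], wires_per_layer) for k in K)
    return internal + A * external_cost(assign, nets)


# Hypothetical data: module A needs 5 internal wires, a layer holds 4.
value = objective(
    {"a": "A", "b": "A", "c": "B"},
    [{"a", "b"}, {"b", "c"}],
    wires={"A": 5, "B": 0},
    K={"A": 10, "B": 10},
    wires_per_layer=4,
    A=2,
)
```

With four wires module A would need one layer, with five it needs two; this jump is what makes the internal cost a staircase function.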

From  a  mathematical  point  of  view  this  application  has  the  flavour  of  both 
a  packing  aspect  and  a  multi-partitioning  aspect.  The  packing  aspect  arises 
from  the  fact  that  certain  items  must  be  assigned  to  modules  such  that  given 
capacities  are  not  exceeded.  Similarly,  one  has  to  decide  which  nets  are 
connected  via  a  wire  within  which  module  such  that  the  given  cut  capacities 
are  still  satisfied.  On  the  other  hand,  the  multi-partitioning  aspect  is  present 
as  well,  since  the  number  of  nets  connecting  items  which  are  assigned  to 
different  modules  has  a  strong  impact  on  the  objective  function. 

We model this problem as an integer program with linear objective function.
Due to the very complex objective function we obtain a model which involves
several clumsy and technical conditions. Moreover, for practical applications,
the model requires several hundreds of thousands of integer variables. Thus, we
decided to study relaxations of this problem. In this scheme a first
relaxation consists in the multiple knapsack problem, which can be viewed as
the task of assigning a given set of items to a given set of modules. Here,
we introduce boolean variables $x_{ik}$, $i \in N$, $k \in M$, with the interpretation
$x_{ik} = 1$ if item i is assigned to module k and $x_{ik} = 0$ otherwise. The
relaxation considers just the assignment of items to modules such that the
corresponding area capacity is taken into account. The number of nets
running between different modules as well as the cut capacity of the modules is
completely neglected. The second relaxation extends the multiple knapsack
model in the sense that nets running between different modules are
approximately taken into account. For every pair of items i, j such that there exists
a net connecting both i and j we introduce an additional 0/1-variable $y_{ij}$
with the following interpretation: $y_{ij} = 1$ if i and j are assigned to different
modules and $y_{ij} = 0$ otherwise. Using these variables we estimate the
number of nets running between different modules by the sum of the $y_{ij}$. This
yields a multidimensional graph partitioning problem. The last relaxation
we are going to consider improves the approximation of the nets and leads
to a multidimensional partitioning problem in a hypergraph. Rather than
introducing variables $y_{ij}$ between pairs of items, we associate a variable $z_{tk}$
with every module $k \in M$ and every net $T_t$. The variable is set to 1 if a
proper subset of the set of items $T_t$ is assigned to module k and is set to zero
otherwise. This class of variables enables us to model the cut capacities of
the modules as well as to count the number of nets running between different
modules.
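
The three families of variables can be derived from a given assignment as sketched below. The function name and data layout are hypothetical, and "proper subset" is read here as a nonempty proper subset, which is an assumption about the intended meaning.

```python
def relaxation_variables(assign, modules, nets):
    """Derive x, y and z values from an assignment (dict item -> module)."""
    # x[i, k] = 1 if item i is assigned to module k.
    x = {(i, k): int(assign[i] == k) for i in assign for k in modules}
    # y[i, j] = 1 if items i and j of a common net lie in different modules.
    y = {}
    for net in nets:
        items = sorted(net)
        for p, i in enumerate(items):
            for j in items[p + 1:]:
                y[(i, j)] = int(assign[i] != assign[j])
    # z[t, k] = 1 if some, but not all, items of net t lie in module k.
    z = {}
    for t, net in enumerate(nets):
        for k in modules:
            inside = sum(1 for i in net if assign[i] == k)
            z[(t, k)] = int(0 < inside < len(net))
    return x, y, z


# Hypothetical instance: one net with three items split over two modules.
x, y, z = relaxation_variables({"a": "A", "b": "A", "c": "B"}, ["A", "B"], [{"a", "b", "c"}])
```

In this instance the single net is cut once, but the pairwise variables sum to 2, whereas the z variables register exactly one net leaving each module. This shows why the pairwise model only approximates the number of nets between modules.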

With each of these relaxations we associate a polyhedron whose vertices are
in one-to-one correspondence with the feasible solutions of the respective model.
Then, solving one of the models reduces to optimizing a linear objective
function over the corresponding polyhedron. In order to apply linear
programming techniques, we need a description of the polytope by means of
equations and inequalities. Thus, a first step in solving these problems via a
polyhedral approach consists in a concise study of the underlying polyhedra.
In this talk we will report on the facial structure of the three relaxations
as well. In particular, there is a nice relationship between facet-defining
inequalities for the three polyhedra.

First one can prove that every facet for the multiple knapsack polytope
defines a facet for the multidimensional graph partitioning polyhedron. Thus,
the facial structure of the multiple knapsack polytope is completely inherited
by the multipartitioning graph polyhedron. Similarly, valid inequalities for
the polytope associated with the second relaxation can be modified such
that they are valid for an appropriate multidimensional hypergraph polytope.
Unfortunately, not every facet for the multidimensional graph partitioning
polytope is inherited by the corresponding hypergraph polytope. Yet, there
are several examples where we may resort to facet-defining inequalities for
the multidimensional graph partitioning polytope and, by modifying them,
obtain facets for the hypergraph polytope. An example of this kind is the
so-called cycle inequality, which we will discuss in the talk.
We conclude by giving some remarks on the relationship between the third
relaxation and the original problem. Here, it turns out that the third relaxation
takes the original side constraints into account. The only difference
between the two of them is that the objective function in the hypergraph
model is simplified and provides just a heuristic estimate of what is really to
be minimized. However, if one is able to handle the multipartitioning hypergraph
polytope from a theoretical as well as from a practical point of view,
one could start with some objective function and optimize over this polytope.
If the solution is feasible for the original model we stop. Otherwise, we
modify the estimate of the objective function for the hypergraph model in a
Lagrangean fashion and repeat this process until we terminate with a globally
feasible solution. Surely, the solution provided that way is not necessarily
optimal for the original problem. However, the objective function is
related to the original one, and thus an optimal solution to the relaxed model
that is still feasible for the original one should not be too bad. In particular,
one should expect that it meets the requirements practitioners are interested
in.

At least from our point of view, this type of approach (providing a series
of reasonable relaxations to a very complex problem and handling the relaxations
theoretically as well as practically) is the best one can expect,
since theoretical and practical progress to date is still far from solving
to optimality large-scale real-world problems that are as complex as the
problems occurring in the design of mainframe computers.
1. Introduction

Spaceborne communication antennas are often required to illuminate an irregular
coverage region on the earth. To achieve this effectively the radiated
beam is shaped in order to concentrate the power on the region. A shaped
beam, also known as a contoured beam, can be obtained with an offset
parabolic reflector with a multi-feed array, shown in figure 1.1. This antenna
generates a set of small element beams. Each element beam is generated by
a feed that radiates towards the reflector, and the element beam is the reflected
secondary field. Each feed is a small metal horn which transmits to free
space. The feeds are arranged in an array such that the corresponding element
beams together cover the region (see figure 1.1). The antenna input is
transformed into feed excitations, i.e. input amplitudes and phases, by a
beam forming network. Thus the element beams are excited correspondingly
and the contoured beam is obtained.

The classical contoured beam optimization problem is then, given the antenna,
to adjust the excitations to maximize the power gain within the coverage.
Additionally, isolation regions may be included in which the power level
shall be suppressed. Several procedures have been proposed for this problem,
including minmax formulations to maximize the minimum power gain
([1], [2] & [3]).
Here we shall describe a procedure in which the feed array parameters are
included in the optimization. The shape of an element beam is highly dependent
on the feed aperture. Feeds with circular or square apertures create
a power gain pattern with circular contours. Rectangular feeds lead to elongated
element beams (figure 1.2). Because of size, weight and losses it is
desirable to keep the number of feeds as small as possible. A well-fitting
contoured beam can be obtained with a limited number of non-circular element
beams. The size and position of the corresponding feeds are usually
found by hand. As the number of parameters characterizing the feeds is
large, it is likely that a manually adjusted array is far from optimum.

2.  Field  calculations 

The secondary far field from a separate rectangular feed must be calculated
in different far-field directions. The parameters of a rectangular feed are the
aperture dimensions $a$ and $b$ and the position of the aperture centre $x_f$ and $y_f$
in the focal plane (Fig. 3.1).

The focal plane coordinates are denoted $x_f, y_f, z_f$ and the corresponding basis
vectors $\hat{x}_f, \hat{y}_f, \hat{z}_f$. A unit direction vector is defined by $\hat{r}_f = (u_f, v_f, w_f) =
(x_f, y_f, z_f)/r_f$, where $r_f = (x_f^2 + y_f^2 + z_f^2)^{1/2}$. The radiated electrical field from the
feed can be written as

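$$E_f = A \frac{e^{-jkr_f}}{r_f} \left( \hat{\gamma}_f w_f - (\hat{x}_f \cdot \hat{\gamma}_f\, u_f + \hat{y}_f \cdot \hat{\gamma}_f\, v_f)\, \hat{z}_f \right) f(u_f, v_f) \qquad (2.1)$$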

where $A$ is a normalization constant, $j$ is the imaginary unit ($j^2 = -1$) and $k$ is
the wave number, related to the speed of light $c$ and the frequency $\nu$ by
$k = 2\pi\nu/c$. The unit vector $\hat{\gamma}_f$ is equal to either $\hat{x}_f$ or $\hat{y}_f$, depending on the feed
polarization. The function $f(u_f, v_f)$ is the Fourier transform of the aperture
field of the feed, denoted $h_f(x_f, y_f)$, for which a simple model is used:


$$h_f(x_f, y_f) = \begin{cases} \cos(\pi y_f / b) & \text{for } -a/2 < x_f < a/2 \text{ and } -b/2 < y_f < b/2 \\ 0 & \text{otherwise} \end{cases} \qquad (2.2)$$

Next, the magnetic field $H$ radiated from the feed is found from $H = \hat{r}_f \times E_f$.
On the reflector surface a current distribution $J(x,y,z)$ is induced, calculated
by the so-called physical optics approximation, $J(x,y,z) = 2\hat{n} \times H$, where $\hat{n}$ is
the unit surface normal. The field and currents are here considered to be
functions of the coordinates $x, y, z$ of the antenna coordinate system (see
Figure 1.1). The secondary field radiated by the reflector can then be
found from ([6])

$$E_{far} = \iint_A \left( J(\rho) - (J(\rho) \cdot \hat{r})\, \hat{r} \right) e^{jk\rho \cdot \hat{r}} \, ds \qquad (2.3)$$

which gives the electrical far field in the direction $\hat{r}$. The vector $\rho = (x,y,z)$ is
the integration variable in the surface integral over the reflector area $A$. The
quantities of interest are the polarization components $e_{co}$ and $e_{cx}$, obtained
from $E_{far}$ by the projections $e_{co} = E_{far} \cdot \hat{e}_{co}^*$ and $e_{cx} = E_{far} \cdot \hat{e}_{cx}^*$, where
$\hat{e}_{co}$ and $\hat{e}_{cx}$ form the desired polarization basis. (* denotes the complex conjugate.)

3.  Array  topology  and  minmax  formulation 
The feeds are mounted with their apertures in the focal plane (figure 1.1).
During the optimization the apertures will vary in size and position. The
feed array parameters cannot, however, vary independently, since aperture
overlaps must not occur. To avoid a considerable number of linear constraints
we have chosen an approach where the array must be composed of
a collocation of rows of rectangular apertures (Figure 3.1). Let the feed coordinate
system in the focal plane have axes parallel to the aperture edges
and let the rows be organized in the $x_f$-direction. Then all feeds of an internal
row must have the same height and $y_f$-coordinate for the aperture center,
whereas the heights of the feeds in the top and bottom row can vary independently.
All vertical edges of neighbouring feed apertures in a row must
coincide and all horizontal edges of neighbouring rows must coincide. With
this topology the independent variables are:

-  One common height for all feeds in each internal row and individual
heights for the feeds in the bottom and top rows.

-  For each row, the $x_f$-coordinate of the aperture center of one of the feeds
and individual widths of all feed apertures.

-  The $y_f$-coordinate of the aperture center of a main reference feed and the rotation
of the complete array.

The maximum number of variables that can be used equals $N + 2R + n_1 + n_R$,
where $N$ is the number of feeds, $R$ the number of rows and $n_1$ and $n_R$ the
number of feeds in the bottom and top row respectively. This number may be reduced
if identical feeds are required. Due to field model limitations, bounds
are needed on the aperture dimensions. Let $x_a \in R^{n_a}$ be a vector with the
chosen array variables, $n_a \le N + 2R + n_1 + n_R$.

The desired power gain is specified over a set of synthesis stations, adequately
sampled to define the coverage and isolation regions. Let the complex
number $e_{ij}(x_a)$ denote the far field at the i'th station of the j'th element
beam, found as discussed in section 2, excited by unit amplitude and zero
phase. The complex vector $e_i(x_a) \in C^N$ holds the values from the $N$ elements.
Further, let $\alpha_j$ denote the j'th complex excitation, where $\mathrm{Re}(\alpha_j) = A_j \cos(Ph_j)$
and $\mathrm{Im}(\alpha_j) = A_j \sin(Ph_j)$, $A_j$ and $Ph_j$ being the j'th excitation amplitude and
phase. The complex excitations are elements of the vector $\alpha \in C^N$. Since the
optimization is performed in real variables we use $x_e \in R^{2N-1}$ such that
$\alpha_j(x_e) = x_{2j-1} + \mathrm{j}\, x_{2j}$, $j = 1, \ldots, N-1$, and $\alpha_N(x_e) = x_{2N-1} + \mathrm{j} \cdot 0$. Therefore
$2N-1$ independent variables are available, since phase is a relative quantity. The
total vector of independent variables is then the concatenation $x = x_e // x_a$,
$x \in R^n$, $n = 2N - 1 + n_a$.
With the above notation the power gains at the stations can be expressed as

$$p_i(x) = \frac{\left( \alpha(x_e)^T e_i(x_a) \right) \left( \alpha(x_e)^T e_i(x_a) \right)^*}{\alpha(x_e)^T \alpha(x_e)^*}, \quad i = 1, \ldots, M \qquad (3.1)$$

where $M$ is the total number of stations (see [3]). For each synthesis station
a residual function is now defined, which is the difference between the
realized gain normalized by the factor $G_0$ and a specified relative gain goal
$p_{i0}$, i.e.

$$f_i(x) = w_i \left( p_i(x)/G_0 - p_{i0} \right) \qquad (3.2)$$

where $G_0$ should be slightly above the expected peak power gain. The specified
relative power gain is used to express the station levels of the desired
pattern, such that $p_{i0} \approx 1$ for a coverage residual and $p_{i0} \approx 0$ for an isolation
residual. The weights $w_i$ are used to equalize the size of coverage and isolation
residuals. The minmax problem to be solved consists of determining $x^* \in R^n$
which minimizes the maximum residual:

$$F(x^*) = \min_{x \in R^n} F(x) = \min_{x \in R^n} \max_{1 \le i \le M} |f_i(x)| \qquad (3.3)$$
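
Equations (3.1) to (3.3) can be evaluated directly once the element-beam fields are known. The sketch below is illustrative only: the function names and the tiny two-feed, two-station data set are assumptions, and the element fields are taken as given constants rather than computed from the array variables as in section 2.

```python
def excitations_from_real(x_e):
    """Map x_e in R^(2N-1) to N complex excitations; the last phase is fixed to zero."""
    n = (len(x_e) + 1) // 2
    alpha = [complex(x_e[2 * j], x_e[2 * j + 1]) for j in range(n - 1)]
    alpha.append(complex(x_e[-1], 0.0))
    return alpha


def power_gain(alpha, e_i):
    """Eq. (3.1): |alpha^T e_i|^2 divided by the total excitation power."""
    field = sum(a * e for a, e in zip(alpha, e_i))
    total_power = sum(abs(a) ** 2 for a in alpha)
    return abs(field) ** 2 / total_power


def minmax_objective(x_e, fields, goals, weights, g0):
    """Eqs. (3.2)-(3.3): largest absolute weighted residual over all stations."""
    alpha = excitations_from_real(x_e)
    residuals = [w * (power_gain(alpha, e_i) / g0 - p0)
                 for e_i, p0, w in zip(fields, goals, weights)]
    return max(abs(f) for f in residuals)


# Hypothetical data: two feeds, one coverage station and one isolation station.
fields = [[1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j]]
goals = [1.0, 0.0]
weights = [1.0, 1.0]
F_value = minmax_objective([1.0, 0.0, 1.0], fields, goals, weights, g0=4.0)
```

With equal in-phase excitations the two beams add at the first station (gain 2) and cancel at the second, so the worst residual comes from the coverage station.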

4.  Solution  of  the  minmax  problem 

The problem (3.3) is solved by the approximate gradient version of the general
minmax method of Madsen [4]. This is an iterative trust region
method. In each iteration the residuals are linearized and the linear model
function is minimized subject to a bound on the solution. The proposed step
is accepted as the next iterate if F decreases. Otherwise the step is repeated
with a reduced bound. To solve the problem on small computers some additions
were needed. In each step worst and near worst case residuals are
identified and then only these are linearized. Thus the storage needed is reduced
by approx. 80%. Gradient approximations are obtained by a combination
of Broyden's rank one formula and differences. The linear model
function is minimized by the exchange method of Powell [5]. In a redesigned
implementation an option for starting with a good guess of the
final set of active linearized residuals defining the solution has been added.
Then, in the non-linear minimax the set of active linearized residuals found
in one step is used as the guess for the final set of the next.
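
The accept/reject logic of such a trust region iteration can be sketched for a single real variable. This is a simplified illustration, not the method of [4]: it assumes one scalar unknown, uses forward differences instead of Broyden updates, and minimizes the linearized model by enumerating breakpoints instead of using the exchange method of [5]. All names are hypothetical.

```python
def linearized_step(f, g, radius):
    """Minimize max_i |f_i + g_i*h| over |h| <= radius for a scalar step h."""
    # |f_i + g_i*h| is the maximum of the two affine functions +/-(f_i + g_i*h).
    lines = [(fi, gi) for fi, gi in zip(f, g)]
    lines += [(-fi, -gi) for fi, gi in zip(f, g)]

    def model(h):
        return max(fi + gi * h for fi, gi in lines)

    # A convex piecewise-linear function attains its minimum on an interval
    # at an endpoint or where two of its affine pieces intersect.
    candidates = [-radius, 0.0, radius]
    for a in range(len(lines)):
        for b in range(a + 1, len(lines)):
            (f1, g1), (f2, g2) = lines[a], lines[b]
            if g1 != g2:
                h = (f2 - f1) / (g1 - g2)
                if abs(h) <= radius:
                    candidates.append(h)
    return min(candidates, key=model)


def minimax_trust_region(residuals, x, radius=1.0, iterations=60, eps=1e-6):
    """Accept a step if F decreases, otherwise retry with a reduced bound."""
    def objective(point):
        return max(abs(r) for r in residuals(point))

    for _ in range(iterations):
        f = residuals(x)
        shifted = residuals(x + eps)
        g = [(fs - fi) / eps for fs, fi in zip(shifted, f)]  # forward differences
        h = linearized_step(f, g, radius)
        if objective(x + h) < objective(x):
            x = x + h
        else:
            radius *= 0.5
    return x


# Toy problem: minimize max(|x - 1|, |x + 1|), whose solution is x = 0.
x_star = minimax_trust_region(lambda x: [x - 1.0, x + 1.0], 3.0)
```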

5.  An  example 

One of the test cases for the procedure was the Brazilsat Antenna System
discussed by Patel & Chan [2]. The requirements were 27 dBi for a high
gain region, shown as a polygon on figure 5.2, and 25 dBi for the rest of
Brazil. (dBi is the power level above isotropic level in dB, $= 10 \log |p_i|$.) The
antenna consists of a 1803 mm offset reflector illuminated by 6 rectangular
feeds as shown in figure 5.1. A total of 97 synthesis stations were used
and the total number of variables was 25. The original array and excitations
were used as the initial point for the iteration.

A  minmax  optimization  using  only  excitations  as  variables  yields  28.61  dBi 
and  26.61  dBi  for  the  high  and  low  gain  zones  resp.  (Fig.  5.2).  The  result 
from  the  optimization  with  array  parameters  included  is  shown  in  figure 
5.3.  The  power  gains  are  29.74  dBi  and  27.74  dBi  for  the  two  zones. 
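
To put these figures in linear terms: both zones improve by 1.13 dB (29.74 - 28.61 and 27.74 - 26.61). The short sketch below converts between dBi and linear power gain, assuming the base-10 logarithm in the definition of dBi; the helper names are illustrative.

```python
import math


def to_dbi(power_gain):
    """Power level above isotropic in dB: 10*log10|p|."""
    return 10.0 * math.log10(abs(power_gain))


def from_dbi(level_dbi):
    """Inverse conversion from dBi to a linear power ratio."""
    return 10.0 ** (level_dbi / 10.0)


improvement_db = 29.74 - 28.61
improvement_ratio = from_dbi(improvement_db)
```

The ratio comes out at roughly 1.3, i.e. including the array parameters raises the minimum power gain in each zone by about 30 percent.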

6.  Concluding  remarks 

General methods for non-linear minmax problems have been used successfully
for contoured beam antenna optimization with the excitations of the
feeds as variables. If the array consists of differently sized rectangular feeds,
the performance of the antenna can be improved further if the feed array parameters
are used as variables together with the excitations.
Abstract:

We study a production management problem in the clothing industry. The model we use is a
large-scale linear integer programming problem with a lot of structure. A Lagrangean relaxation
method coupled with heuristics yields a good bracketing of the optimal solution by dualizing the
state equations. A subgradient technique is used to solve the Lagrangean dual; each iteration
reduces to the exact solution of knapsack problems.


Key  words  :  Lagrangean  relaxation,  integer  linear  program,  production  problem, 
subgradient  optimization. 
1. Introduction

We consider a problem of production management in a process industry application. This
problem can be formulated as a strongly structured large-scale linear integer programming
problem.

The paper develops a Lagrangean relaxation technique that successively solves a
sequence of generalized knapsack problems. A near-optimal solution is obtained via a
subgradient method coupled with a heuristic. Numerical experiments with real and randomly
generated instances are in progress to validate the approach.

Section  2  deals  with  a  short  description  of  the  production  system  and  a  deterministic 
model  formulation.  Section  3  describes  a  suitable  strategy  for  large-scale  instances;  each 
iteration  of  an  ad  hoc  Lagrangean  relaxation  reduces  to  exact  solving  of  small-size  knapsack 
problems. 

2.  Model  formulation 

We  study  an  inventory  production  management  problem  in  the  clothing  industry.  An 
effective  production  planning  system  determines  the  appropriate  levels  of  production  and 
inventory  according  to  fluctuating  demand  requirements  and  minimum  costs. 

Generally  speaking,  a  manufacturing  production  system  in  the  clothing  industry  can  be 
viewed  as  a  sequence  of  transformations  applied  to  raw  materials  to  obtain  finally  a  finished 
product. 

In the formulation, external subsystems such as extraction and transportation of the raw
materials are neglected. Then the process can be subdivided into three subsystems (see
figure):

-  transformation of the raw materials into raw pieces;

-  shaping of the raw pieces into shaped pieces;

-  assembling  of  the  intermediate  subproducts  to  provide  the  finished  items. 

Each  subsystem  is  characterized  by  the  following  decision  variables:  vector  of  products, 
stocks  and  demands.  We  assume  that  a  discrete  deterministic  model  is  available  and  that 
external  demands  of  the  global  system  are  given  ([4],[8]). 

Let N denote the number of items to fabricate, m the cardinality of the set of raw materials

index,  J  cardinality  of  Ej  references  set  of  elementary  pieces  forms,  L  c  Ej  *  the  number  of 

intermediate  subproducts  of  transformation  and  shaping  subsystems,  T  the  number  of  periods 
in  the  planning  horizon  and  k=3  the  number  of  subsystems. 

The following subscripts and superscripts are also used in the description of the model:
k: index of subsystem (k = 1,2,3);
t: time period (t = 1,...,T);
i: index of finished product if k = 1 (i = 1,...,N) or of intermediate subproduct if k = 2,3
(i = 1,...,L).

We now introduce the vectors of decision variables of the problem:

$Y_t^k = [y_{i,t}^k]$: number of units of product i to be produced during period t in subsystem k.

$U_t^k = [u_{i,t}^k]$: number of units of product i in storage during period t in subsystem k.

$D_t^k = [d_{i,t}^k]$: demand of product i during period t in subsystem k (k = 2,3).

$D_t = [d_{i,t}]$: given external demand of product i during period t.
[Figure: Discrete production system in manufacturing, shown over the planning horizon]

The following vectors of data are required for the formulation:

$R_t^k = [r_{i,t}^k]$: availability factor of product i of subsystem k during period t.

$S_t^k = [s_{i,t}^k]$: availability factor of storage i of subsystem k during period t.

$\varphi_t^k$: production resource availability of subsystem k during period t.

$\psi_t^k$: storage resource availability of subsystem k during period t.

$\bar{Y}^k = [\bar{y}_i^k]$: maximum production allowed per period for item i of subsystem k.

$\underline{U}_t^k = [\underline{u}_{i,t}^k]$, $\bar{U}_t^k = [\bar{u}_{i,t}^k]$: lower and upper bounds for storage of product i from
subsystem k during period t.

$C_t^{1k} = [c_{i,t}^{1k}]$: unit production cost over period t for item i of subsystem k.

$C_t^{2k} = [c_{i,t}^{2k}]$: unit storage cost over period t for item i of subsystem k.

$C_t^{3k} = [c_{i,t}^{3k}]$: cost of modifications when fabrication changes from one period to another,

and the matrix $\Pi = (a_i^l)$:

$a_i^l$: number of pieces l necessary to fabricate item i.

The constraints of the problem are either technological or economical. These constraints
and the production cost are assumed to be linear. We suppose that the objective function is also
linear in the product stock ($u_{i,t}^k$) and in the change of activity level from one period to
another ($y_{i,t}^k - y_{i,t-1}^k$). Hence the model can be formulated as the following large-scale integer
linear program with 2(N+3L)T variables and (3N+8L+6)T constraints:

$$\min z = \sum_{k=1}^{3} \sum_{t=1}^{T} \left\{ C_t^{1k} Y_t^k + C_t^{2k} U_t^k + C_t^{3k} (Y_t^k - Y_{t-1}^k) \right\}$$

s.t.

$R_t^k Y_t^k \le \varphi_t^k$    (k-1)

$0 \le Y_t^k \le \bar{Y}^k$    (k-2)

$S_t^k U_t^k \le \psi_t^k$    (k-3)

$\underline{U}_t^k \le U_t^k \le \bar{U}_t^k$    (k-4)

$U_t^k = \sum_{\theta=1}^{t} \{ Y_\theta^k - D_\theta^k \}$, with $D_\theta^1 = D_\theta$    (k-5)

$D_t^2 = \Pi Y_t^1$    (2-6)

$D_t^3 = Y_t^2$    (3-6)

$Y_t^k, U_t^k, D_t^k$ integer, k = 1,2,3, t = 1,...,T

The initial conditions $U_0^k$ and $Y_0^k$ are given and set to zero, and we assume $C_{T+1}^{3k} = 0$.


3. Lagrangean approach

The coupling constraints (2-6) and (3-6) that link the subsystems are used to eliminate
only the decision variables $D^2$ and $D^3$. After rearranging terms, the problem can be stated as an
equivalent linear program with 2(N+2L)T variables and (3N+6L+6)T constraints:
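(P)

$$\min \sum_{k=1}^{3} \sum_{t=1}^{T} \left\{ F_t^{1k} Y_t^k + C_t^{2k} U_t^k \right\}$$

s.t.

$R_t^k Y_t^k \le \varphi_t^k$

$0 \le Y_t^k \le \bar{Y}^k$

$S_t^k U_t^k \le \psi_t^k$

$\underline{U}_t^k \le U_t^k \le \bar{U}_t^k$

$U_t^1 = \sum_{\theta=1}^{t} \{ Y_\theta^1 - D_\theta \}$    (1-5)

$U_t^2 = \sum_{\theta=1}^{t} \{ Y_\theta^2 - \Pi Y_\theta^1 \}$    (2-5)

$U_t^3 = \sum_{\theta=1}^{t} \{ Y_\theta^3 - Y_\theta^2 \}$    (3-5)

$Y_t^k, U_t^k$ integer, k = 1,2,3, t = 1,...,T

with $F_t^{1k} = C_t^{1k} + C_t^{3k} - C_{t+1}^{3k}$ and $C_{T+1}^{3k} = 0$.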

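By dualizing the state equations ((k-5), k = 1,2,3) with a multiplier V one obtains the
Lagrangean relaxation subproblem:

(LR(V))

$$\min \; cste + \sum_{k=1}^{3} \sum_{t=1}^{T} \left\{ G_t^{1k} Y_t^k + G_t^{2k} U_t^k \right\}$$

s.t.

$R_t^k Y_t^k \le \varphi_t^k$

$0 \le Y_t^k \le \bar{Y}^k$

$S_t^k U_t^k \le \psi_t^k$

$\underline{U}_t^k \le U_t^k \le \bar{U}_t^k$

$Y_t^k, U_t^k$ integer, k = 1,2,3, t = 1,...,T

where $V = (V^1, V^2, V^3)$, $V^k = (V_t^k)$,

$G_t^{11} = F_t^{11} + \sum_{\theta=t}^{T} \{ V_\theta^1 - V_\theta^2 \Pi \}$

$G_t^{12} = F_t^{12} + \sum_{\theta=t}^{T} \{ V_\theta^2 - V_\theta^3 \}$

$G_t^{13} = F_t^{13} + \sum_{\theta=t}^{T} V_\theta^3$

$G_t^{2k} = C_t^{2k} - V_t^k$

$cste = - \sum_{t=1}^{T} V_t^1 \sum_{\theta=1}^{t} D_\theta$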
8*i 
Each subproblem (LR(V)) can be decomposed into 6T knapsack problems over period
t, subsystem k and variables $Y_t^k$ and $U_t^k$. The first 3T subproblems are knapsack problems
with upper bounded variables

($[LR1(V)]_t^k$)

$\min G_t^{1k} Y_t^k$

s.t.  $R_t^k Y_t^k \le \varphi_t^k$

$0 \le Y_t^k \le \bar{Y}^k$, $Y_t^k$ integer, k = 1,2,3, t = 1,...,T

and the 3T other subproblems are knapsack problems with lower and upper bounded
variables:

($[LR2(V)]_t^k$)

$\min G_t^{2k} U_t^k$

s.t.  $S_t^k U_t^k \le \psi_t^k$

$\underline{U}_t^k \le U_t^k \le \bar{U}_t^k$, $U_t^k$ integer, k = 1,2,3, t = 1,...,T

The corresponding Lagrangean dual, given by

(D)    $\max_V \; v(LR(V))$

where v(.) denotes the optimal value of problem (.), is solved by a subgradient algorithm ([3]),
and v(D) is a lower bound of v(P).
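
The subgradient iteration can be sketched on a generic toy problem with a single dualized equation. This is an illustration of the technique only: the step size 1/k, the enumeration of a small finite domain in place of the knapsack solves, and all names and data are assumptions, not the implementation described in this paper.

```python
from itertools import product


def lagrangean_value(v, cost, coeff, rhs, domain):
    """Return (L(v), minimizer) for min cost.x + v*(rhs - coeff.x) over a finite domain."""
    best_value, best_x = None, None
    for x in domain:
        slack = rhs - sum(a * xi for a, xi in zip(coeff, x))
        value = sum(c * xi for c, xi in zip(cost, x)) + v * slack
        if best_value is None or value < best_value:
            best_value, best_x = value, x
    return best_value, best_x


def subgradient_ascent(cost, coeff, rhs, domain, iterations=50):
    """Maximize the Lagrangean dual function with diminishing steps 1/k."""
    v, best_bound = 0.0, float("-inf")
    for k in range(1, iterations + 1):
        value, x = lagrangean_value(v, cost, coeff, rhs, domain)
        best_bound = max(best_bound, value)
        subgradient = rhs - sum(a * xi for a, xi in zip(coeff, x))
        if subgradient == 0:
            break  # x satisfies the dualized equation, so the bound is attained
        v += subgradient / k
    return best_bound, v


# Toy problem: min x1 + 3*x2 s.t. x1 + x2 = 3, x in {0,...,3}^2 (optimal value 3).
domain = list(product(range(4), repeat=2))
bound, multiplier = subgradient_ascent([1, 3], [1, 1], 3, domain)
```

Every value of the dual function is a valid lower bound, so the best bound seen so far can be reported even if the iteration is stopped early.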


4. Preliminary conclusions

This approach is a suitable alternative to the one provided by Soenen ([8]). In his
paper, the size of the model is reduced to (N+2L)T variables and (2N+4L+6)T constraints by
using the state equations (k-5) and the coupling constraints (2-6)-(3-6) to eliminate the variables
$U^k$ (k = 1,2,3) and $D^k$ (k = 2,3). Then the Dantzig-Wolfe decomposition ([2]) is applied to
solve the LP-relaxation, and an integer feasible solution is obtained by rounding the LP-optimal
solution. The author has also suggested splitting the problem by creating copies of the original
variables $Y_t^2$ of the shaping subsystem. He was among the first to suggest the idea of variable
splitting, and later Guignard and Kim ([6], [7]) formalized this idea for a general mixed-integer
programming problem. But in this case, the subproblems induced by the dualization of the
copy constraint do not reduce to knapsack problems. Though our model has twice as many
variables, the approach is attractive when the size increases. First, it is well known that the
bound provided by the Lagrangean dual is generally tighter than the LP one. Secondly, at
each iteration of the subgradient algorithm, the subproblems are small knapsack problems that are
easy to solve exactly. Moreover, feasible solutions can be constructed, starting from the
optimal solution of each subproblem, to furnish a bracket of the optimal solution of the initial
problem (P).
The algorithm was implemented in GAMS 2.25 (General Algebraic Modeling System
[1]) with solvers ZOOM or LAMPS. First numerical experiments show that our approach is
suitable for large-scale instances.
Abstract

We consider various relaxations and/or decompositions for solving linear integer minimax
problems. The main results concern the comparison of the bounds they provide and necessary
and sufficient conditions to obtain a sharper bound or a zero duality gap with Lagrangean
decomposition.


1. Introduction

Confronted with the problem of minimizing, in integers, the maximum of several
functions, one usually introduces an extra variable, say y, to be minimized, and writes
constraints which force y to be no less than these functions. These new constraints destroy
whatever structure the problem had initially, and render its resolution much harder. One can
obtain lower bounds on the optimal value of y by relaxing these constraints and then optimizing
the bound thus obtained. We will consider several relaxations and compare the bounds they
provide. We will also study some specific minmax models and provide preliminary
computational evidence on the quality of the bounds.

In  section  2  we  show  how  various  relaxations  and  decompositions  compare  in  terms  of 
the  bound  they  provide  when  only  the  new  constraints  are  dualized.  In  section  3  we  consider  a 
two-level  relaxation  scheme,  where  complicating  constraints  of  the  initial  structure  require 
relaxation.  An  illustration  of  the  main  results  is  provided  by  an  example  in  section  4. 


Notation 

We shall use the following notation. Given a constrained optimization problem (.), $(\overline{\,\cdot\,})$ will
denote its continuous relaxation, FS(.) its feasible set, OS(.) its optimal set, i.e. the set of all its
optimal solutions, and v(.) its optimal value. Co(S) will denote the convex hull of a set S of $R^n$
and the superscript t transposition.
2. Minimax constraints dualization

Consider the following linear minmax problem (P)

$$\min_{x \in S} \; \max_{i=1,\ldots,p} F_i(x)$$

where S is a discrete subset of $R^n$ and the $F_i$'s, $i = 1, \ldots, p$, are linear functions of the form
$F_i(x) = f_i x + g_i$.

Let us now introduce a new variable $\delta$ to represent the maximum of the p functions $F_i(x)$
and let us rewrite (P) as

$$\min_{x,\delta} \{ \delta \mid Fx + g \le \delta e, \; x \in S \}$$

where F is the $p \times n$ matrix $(f_i)_{i=1,p}$, g the vector $(g_i)_{i=1,p}$ and e the all-one vector $(1,\ldots,1)$ of $R^p$.

The new constraints $Fx + g \le \delta e$ destroy whatever structure the set S had initially, and render the
resolution of problem (P) much harder.

The first basic results concern the comparison of bounds provided by various relaxations
which are obtained by only dualizing the minimax constraints $Fx + g \le \delta e$.

We introduce the set of multipliers $U = \{ u \in R^p \mid u \ge 0, \; \sum_{i=1}^{p} u_i = 1 \}$ and the maxmin dual
problem (Q)

$$\max_{u \in U} \; \min_{x \in S} \; u(Fx + g)$$

As S is not a convex set, only the classical minimax inequality holds and we have the following
inequalities

$v(P) = \min \{ \max \{ F_i(x) \mid i = 1,\ldots,p \} \mid x \in S \}$

$= \min \{ \max \{ u(Fx + g) \mid u \in U \} \mid x \in S \}$

$\ge \max \{ \min \{ u(Fx + g) \mid x \in S \} \mid u \in U \} = v(Q)$

Hence the minimax duality gap $\sigma = v(P) - v(Q)$ is due to the nonconvexity of S. Compactness
and convexity of S are sufficient but not necessary conditions to have $\sigma = 0$. When S is
compact, arguments based on Lagrangean duality (Geoffrion, [3]) lead to the equivalence
between (Q) and the following linear program (P*)

$$\min_{x,\delta} \{ \delta \mid Fx + g \le \delta e, \; x \in Co(S) \}$$

Therefore the duality gap $\sigma$ can be positive only when the feasible set $\{ \delta \mid Fx + g \le \delta e, \; x \in Co(S) \}$
has noninteger vertices.
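
A two-function example makes the gap concrete. The sketch below is illustrative and rests on assumptions: it takes p = 2, so that U is the segment u = (t, 1 - t), represents each point of S by its pair of integer function values, and finds v(Q) exactly by checking the endpoints and breakpoints of the concave piecewise-linear dual function. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def value_P(points):
    """v(P): minimize over S the larger of the two function values."""
    return min(max(values) for values in points)


def value_Q(points):
    """v(Q) for p = 2, with u = (t, 1 - t) and t in [0, 1]."""
    def inner(t):
        return min(t * a + (1 - t) * b for a, b in points)

    # The inner minimum is concave and piecewise linear in t, so its maximum
    # lies at t = 0, t = 1 or where two of the lines b + t*(a - b) intersect.
    candidates = [Fraction(0), Fraction(1)]
    for (a1, b1), (a2, b2) in combinations(points, 2):
        denom = (a1 - b1) - (a2 - b2)
        if denom != 0:
            t = Fraction(b2 - b1, denom)
            if 0 <= t <= 1:
                candidates.append(t)
    return max(inner(t) for t in candidates)


# S = {0, 1}, F_1(x) = 2x, F_2(x) = 2 - 2x: values (0, 2) at x = 0 and (2, 0) at x = 1.
points = [(0, 2), (2, 0)]
gap = value_P(points) - value_Q(points)
```

Here v(P) = 2 while v(Q) = 1, attained at u = (1/2, 1/2); the value 1 corresponds to the noninteger point x = 1/2 of Co(S), in line with the equivalence between (Q) and (P*).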

We now compare the optimal value v(Q) with lower bounds of (P) provided by three
different relaxations of the minimax constraints $Fx + g \le \delta e$.
Problem (P) is equivalent to (P°), in which we create multiple copies of $\delta$ as a p-vector z:

$$\min_{x,z} \{ \tfrac{1}{p} e^t z \mid Fx + g \le z, \; x \in S, \; z = Ez \}$$

where E is a (p,p) cyclic permutation matrix such that

$$E_{ij} = \begin{cases} 1 & \text{if } j = i+1, \; i = 1,\ldots,p-1 \\ 1 & \text{if } j = 1, \; i = p \\ 0 & \text{otherwise} \end{cases}$$

We now dualize the copy constraint z = Ez with multiplier u and obtain the Lagrangean
decomposition subproblem

(LD(u))    $\min_{x,z} \{ \varphi(u) z \mid Fx + g \le z, \; x \in S \}$

where $\varphi$ is the linear function on $R^p$ defined by $\varphi(u) = u(I - E) + \tfrac{1}{p} e^t$ and I the identity matrix.

The corresponding Lagrangean dual (LD) is

$$\max_u \; v(LD(u))$$

where $v(LD(u)) = \min \{ \sum_{j=1}^{p} (\tfrac{1}{p} + u_j - u_{j-1}) z_j \mid F_j x + g_j \le z_j, \; j = 1,\ldots,p, \; x \in S \}$, with $u_0 = u_p$.
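
The coefficients $\tfrac{1}{p} + u_j - u_{j-1}$ follow directly from the definition of E. A minimal sketch that builds the cyclic permutation matrix and evaluates $\varphi(u)$ with exact rational arithmetic (function names are illustrative, indices are zero-based):

```python
from fractions import Fraction


def cyclic_permutation(p):
    """E[i][j] = 1 if j = i + 1 (cyclically, so row p wraps to column 1), else 0."""
    return [[1 if j == (i + 1) % p else 0 for j in range(p)] for i in range(p)]


def phi(u):
    """phi(u) = u(I - E) + e^t / p, returned as a list of p coefficients."""
    p = len(u)
    E = cyclic_permutation(p)
    uE = [sum(u[i] * E[i][j] for i in range(p)) for j in range(p)]
    return [u[j] - uE[j] + Fraction(1, p) for j in range(p)]


coefficients = phi([1, 2, 4])
```

The multiplier terms cancel in the sum of the coefficients, which therefore always equals 1; this is what keeps $\varphi(u) z$ equal to $\delta$ whenever all copies $z_j$ coincide.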

We also define the Lagrangean and the surrogate duals of (P) relative to the minimax
constraints $Fx + g \le \delta e$:

(LR)    $\max_{u \ge 0} \; v(LR(u))$

where $v(LR(u)) = \min \{ \delta + u(Fx + g - \delta e) \mid x \in S \} = ug + \min \{ (1 - ue)\delta + uFx \mid x \in S \}$,

(SD)    $\max_{u \ge 0} \; v(SD(u))$

where $v(SD(u)) = \min \{ \delta \mid u(Fx + g) \le \delta\, ue, \; x \in S \}$.


The following theorem states that all the above mentioned duals of (P) provide identical
lower bounds equal to v(Q).

Theorem 1.

$$v(P) \ge v(Q) = v(LD) = v(SD) = v(LR)$$


To conclude the section, we give sufficient conditions under which the duality gap $\sigma$
equals zero.

Proposition 2.

Let u* be an extreme point of the convex set $U = \{ u \in R^p \mid u \ge 0, \; \sum_{i=1}^{p} u_i = 1 \}$ and let h
denote the index such that $u^*_h = 1$. If there exists an optimal solution x* of the relaxation
(Q(u*)): $\min \{ u^*(Fx + g) \mid x \in S \}$ such that $\max \{ (Fx^* + g)_i \mid i = 1,\ldots,p \} = (Fx^* + g)_h$,
then $u^* \in OS(Q)$, $x^* \in OS(P)$ and $\sigma = 0$.
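
The condition of Proposition 2 is easy to test by enumeration when S is small. The following sketch is an illustration under assumptions: S is given explicitly as a list of pairs (x, values of Fx + g), the index h is zero-based, and the function name is hypothetical.

```python
def certify_zero_gap(points):
    """Look for (h, x*) satisfying the condition of Proposition 2.

    points: list of (x, values) where values is the p-tuple (Fx + g).
    Returns (h, x*) if some unit vector u* = e_h certifies a zero gap, else None.
    """
    p = len(points[0][1])
    for h in range(p):
        # Solve (Q(u*)) for u* = e_h: minimize the h-th function over S.
        best = min(values[h] for _, values in points)
        for x, values in points:
            # x is optimal for (Q(u*)) and its maximum is attained at index h.
            if values[h] == best and max(values) == values[h]:
                return h, x
    return None


certificate = certify_zero_gap([(0, (1, 3)), (1, (2, 2))])
no_certificate = certify_zero_gap([(0, (0, 2)), (1, (2, 0))])
```

In the first instance the second function alone determines the optimum (x* = 1 with value 2), so the gap is zero. In the second instance no unit multiplier works, which is consistent with its positive gap, although the condition is only sufficient in general.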
3. Integer problems with special structure

In section 2, we have assumed that the relaxed problems $\min \{ f_u(x) \mid x \in S \}$, for
linear functions $f_u(x) = u(Fx + g)$, could be solved, at least reasonably easily, so that no
relaxation of the constraints defining S was necessary. In this section, we shall consider cases
for which S contains so-called "complicating" constraints, requiring relaxation of some of the
constraints in S. The bounds may not be as strong as those described above, because of the
two-level relaxation, yet they may still provide one of the best approaches to solve such
problems.

So we consider now the case where the constraint set S can be partitioned into two
subsets, $S = \{ x \mid Ax \le b, \; Cx \le d, \; x \in \Omega \}$, where $\Omega \subset R^m \times Z^{n-m}$ is a discrete subset and $Ax \le b$
are the complicating constraints. The problem (P) is then equivalent to

$$\min_{x,\delta} \{ \delta \mid Fx + g \le \delta e, \; Ax \le b, \; Cx \le d, \; x \in \Omega \}$$


First we compare the bounds obtained by relaxing the minimax constraints $Fx+g \le \delta e$
and the complicating constraints $Ax \le b$. The following relaxations differ in the way one dualizes
the minimax constraints and the complicating constraints.

Problem (P) is equivalent to problem $(P^i)$, $i = 1,2,3$, in which we introduce multiple copies
of $\delta$ as a p-vector $z = (z_1, z_2, \ldots, z_p)$, and one copy $y$ of $x$:

/pi)  min  {  ^‘z  I  Fx+g  Sz,  Ay  s  b,  Cx  S  d,  X  =  y,  z  =  Ez,  xeX,  yey  ) 

/p2)  min  {  I  Fx+g  <  z,  Ax  <  b,  Cx  <  d,  z  =  Ez,  xe  ft } 

'  '  [  x,z 

\[ (P^3) \qquad \min_{x,y,\delta} \{\, \delta \mid Fx+g \le \delta e,\ Ay \le b,\ Cx \le d,\ x = y,\ x \in X,\ y \in Y \,\} \]

with $X \cap Y = \Omega$.


We dualize all the copy constraints in $(P^1)$ and obtain a first Lagrangean dual

\[ (LD^1) \qquad \max_{u,v} \; v(LD^1(u,v)) \]

where

\[ v(LD^1(u,v)) = \min \{\, \varphi(u)z - vx \mid Fx+g \le z,\ Cx \le d,\ x \in X \,\} + \min \{\, vy \mid Ay \le b,\ y \in Y \,\}. \]

In $(P^2)$, we dualize the copy constraints with multiplier $u$ and the complicating constraints
with multiplier $w$, to obtain the second Lagrangean dual

\[ (LD^2) \qquad \max_{u,\ w \ge 0} \; v(LD^2(u,w)) \]

where

\[ v(LD^2(u,w)) = -wb + \min \{\, \varphi(u)z + wAx \mid Fx+g \le z,\ Cx \le d,\ x \in \Omega \,\}. \]

Finally, in $(P^3)$, we dualize the copy constraints and the minimax constraints and obtain
the third Lagrangean dual

\[ (LD^3) \qquad \max_{u \ge 0,\ v} \; v(LD^3(u,v)) \]

where

\[ v(LD^3(u,v)) = ug + \min \{\, (1-ue)\delta + (uF - v)x \mid Cx \le d,\ x \in X \,\} + \min \{\, vy \mid Ay \le b,\ y \in Y \,\}. \]


We also define the standard Lagrangean dual by relaxing the constraints $Fx+g \le \delta e$ and
$Ax \le b$:

\[ (LR) \qquad \max_{u \ge 0,\ w \ge 0} \; v(LR(u,w)) \]

where

\[ v(LR(u,w)) = -wb + ug + \min \{\, (1-ue)\delta + (uF + wA)x \mid Cx \le d,\ x \in \Omega \,\}. \]

The next proposition sums up the relationships between all these Lagrangean duals.

Proposition 3. $v(LD^1) = v(LD^3)$ and $v(LD^2) = v(LR)$. □

Since $(LD^1)$ and $(LD^3)$ are equivalent, we will call (LD) this one true Lagrangean
decomposition of (P), and since $(LD^2)$ and (LR) are equivalent, we still simply call (LR) this
Lagrangean relaxation.

To discuss the quality of the lower bounds provided by (LD) and (LR), we introduce the
LP relaxation of (P), denoted by $(\bar{P})$:

\[ (\bar{P}) \qquad \min_{x,\delta} \{\, \delta \mid Fx+g \le \delta e,\ Ax \le b,\ Cx \le d,\ x \in co(\Omega) \,\} \]

The  main  properties  may  be  summarized  as  follows: 

Theorem 4

(i) $v(Q) \ge v(\bar{P})$; if (P) has the Integrality Property

\[ \{\, x \mid Ax \le b,\ Cx \le d,\ x \in co(\Omega) \,\} = co\{\, x \mid Ax \le b,\ Cx \le d,\ x \in \Omega \,\} \]

then $v(Q) = v(\bar{P})$.

(ii) If $Y$ is convex, then $v(LD) \le v(LR)$.

(iii) If $Y$ is convex and $co(\Omega) = co(X) \cap Y$, then $v(LD) \ge v(\bar{P})$.

(iv) $v(Q) \ge \max \{\, v(LD),\ v(LR) \,\}$.

(v) If $X = \Omega$ then $v(LR) \ge v(\bar{P})$; if (P) has the partial Integrality Property

\[ \{\, x \mid Cx \le d,\ x \in co(\Omega) \,\} = co\{\, x \mid Cx \le d,\ x \in \Omega \,\} \]

then $v(LR) = v(\bar{P})$.

(vi) If $X = Y = \Omega$ then $v(LD) \ge v(LR)$; if (P) has the partial Integrality Property

\[ \{\, x \mid Ax \le b,\ x \in co(\Omega) \,\} = co\{\, x \mid Ax \le b,\ x \in \Omega \,\} \]

then $v(LD) = v(LR)$. □

Theorem 4 is important for recognizing the cases where Lagrangean Decomposition
could provide tighter bounds than Lagrangean or LP relaxations. This will happen frequently
when $X = Y = \Omega$ and (P) does not have the partial Integrality Property
$\{\, x \mid Ax \le b,\ x \in co(\Omega) \,\} = co\{\, x \mid Ax \le b,\ x \in \Omega \,\}$.

The following theorem gives sufficient conditions to obtain $v(LD)$ equal to $v(P)$ and
generalizes Proposition 2.

Theorem 5. Let $(u^*,v) \in OS(LD)$ and $(x^*,y^*) \in OS(LD(u^*,v))$. If $x^* = y^*$ then

(i)  For  any  multiplier  v  ^  0,  U(LD(u*,v))  =  u(Q{u*)) . 

(ii) If $u^* \in OS(Q)$ then $v(LD) = v(Q)$.

(iii) If $u^*$ is an extreme point of $U$ and $\max \{\, (Fx^*+g)_i \mid i = 1, \ldots, p \,\} = (Fx^*+g)_h$, where $h$ denotes
the index such that $u^*_h = 1$, then $x^* \in OS(P)$ and $v(LD) = v(Q) = v(P)$. □

4.  Example 

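The sufficient conditions given above are useful in practice and easy to check. Consider
the following makespan problem $(P_{\alpha,\beta})$ with 2 machines and 2 jobs (see Escudero [1] for a full
description of the problem and also [7,8]):

\[
\begin{array}{ll}
\min & \max\,(\, 4x_1 + 3x_3 + 4x_5 + 4x_7,\;\; 4x_2 + 2x_4 + 2x_6 + 3x_8 \,) \qquad (Fx+g) \\[4pt]
\text{s.t.} & x_1 + x_2 = 1, \qquad x_3 + x_4 = 1, \\
 & x_1 \le x_5, \qquad x_3 \le x_7, \\
 & x_2 \le x_6, \qquad x_4 \le x_8, \\
 & 4x_1 + 3x_3 + 4x_5 + 4x_7 \le \alpha, \\
 & 4x_2 + 2x_4 + 2x_6 + 3x_8 \le \beta, \\
 & x_i \in \{0,1\}, \quad i = 1, \ldots, 8.
\end{array}
\]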

The first (resp. second) machine is available for $\alpha$ (resp. $\beta$) units of time. $x_5$ (resp. $x_6$)
represents the assignment of job type 1 to machine 1 (resp. machine 2) with a potential setup
time of 4 (resp. 2) units of time. Similarly, $x_7$ (resp. $x_8$) represents the assignment of job type 2
to machine 1 (resp. machine 2) with a potential setup time of 4 units (resp. 3). Finally, $x_1$ (resp.
$x_2$) corresponds to the assignment of job 1 to machine 1 (resp. machine 2) with processing times
of 4 on both machines. Similarly, $x_3$ and $x_4$ play the same roles for job 2, with processing times of 3
and 2 units respectively.

For problem $(P_{7,6})$ the sufficient conditions of Theorem 5 (iii) are satisfied by $u^* = (1,0)$,
$v^* = (4,4,3,4,4,4,4,4)$ and $x^* = y^* = (0,1,1,0,0,1,1,0)$. One has $v(LD(u^*,v^*)) = 7$, thus $v(P) = v(Q) = v(LD) = 7$
and $x^* \in OS(P)$.
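Because the example has only eight binary variables, the stated optimum can be checked by complete enumeration. The following is a minimal sketch, not the authors' method: the function names (`loads`, `feasible`, `solve`) are illustrative, and the setup-linking constraints are read as $x_1 \le x_5$, $x_3 \le x_7$, $x_2 \le x_6$, $x_4 \le x_8$, which is an interpretation of the printed constraints based on the verbal description of the variables.

```python
from itertools import product

# Rows of F: load of machine 1 and of machine 2 as functions of x1..x8 (g = 0).
F = [
    (4, 0, 3, 0, 4, 0, 4, 0),
    (0, 4, 0, 2, 0, 2, 0, 3),
]


def loads(x):
    """Return (load of machine 1, load of machine 2) for a 0-1 vector x."""
    return tuple(sum(f * xi for f, xi in zip(row, x)) for row in F)


def feasible(x, alpha, beta):
    """Check assignment, setup-linking and availability constraints."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    m1, m2 = loads(x)
    return (
        x1 + x2 == 1 and x3 + x4 == 1   # each job goes to exactly one machine
        and x1 <= x5 and x3 <= x7       # setups required on machine 1 (assumed reading)
        and x2 <= x6 and x4 <= x8       # setups required on machine 2 (assumed reading)
        and m1 <= alpha and m2 <= beta  # machine availability
    )


def solve(alpha, beta):
    """Enumerate {0,1}^8; return (optimal makespan, list of optimal solutions)."""
    best, optima = None, []
    for x in product((0, 1), repeat=8):
        if not feasible(x, alpha, beta):
            continue
        value = max(loads(x))
        if best is None or value < best:
            best, optima = value, [x]
        elif value == best:
            optima.append(x)
    return best, optima
```

Calling `solve(7, 6)` enumerates the 256 candidate vectors and returns the optimal makespan together with every optimal solution, so the value 7 and the solution $x^*$ quoted above can be compared directly.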

It is important to notice that condition $x^* = y^*$ of Theorem 5 alone is not sufficient for
optimality, as it is in (Guignard, Kim [4,5,6]). Indeed, consider $(P_{15,11})$; it is easy to show that:

. $v(P) = 7$ and $OS(P) = \{(0,1,1,0,0,1,1,0)\}$.

. $v(Q) = v(Q(u^*)) = \frac{44}{7}$ with $u^* = (\frac{3}{7}, \frac{4}{7})$, which is not an extreme point of $U$.

. $v(LD) = v(LD(u^*,v^*)) = v(Q)$ with $v^* = u^*F = \frac{1}{7}(12,16,9,8,12,8,12,12)$; $x^1 =
(1,0,0,1,1,0,0,1)$ and $x^2 = (0,1,0,1,0,1,0,1)$ are such that $(x^1,x^1)$ and $(x^2,x^2)$ belong to
$OS(LD(u^*,v^*))$, but $v(LD) < v(P)$.
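The value of the relaxation for a given multiplier can be checked in the same brute-force way, using exact rational arithmetic. The sketch below is illustrative only: `v_Q` is a hypothetical helper that evaluates $\min \{ u(Fx+g) \mid x \in S \}$ with $g = 0$ by enumeration, and the setup-linking constraints are again read as $x_1 \le x_5$, $x_3 \le x_7$, $x_2 \le x_6$, $x_4 \le x_8$.

```python
from fractions import Fraction
from itertools import product

# Rows of F: load of machine 1 and of machine 2 as functions of x1..x8 (g = 0).
F = [
    (4, 0, 3, 0, 4, 0, 4, 0),
    (0, 4, 0, 2, 0, 2, 0, 3),
]


def loads(x):
    """Return (load of machine 1, load of machine 2) for a 0-1 vector x."""
    return tuple(sum(f * xi for f, xi in zip(row, x)) for row in F)


def feasible(x, alpha, beta):
    """Membership test for S (assumed reading of the printed constraints)."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    m1, m2 = loads(x)
    return (
        x1 + x2 == 1 and x3 + x4 == 1
        and x1 <= x5 and x3 <= x7 and x2 <= x6 and x4 <= x8
        and m1 <= alpha and m2 <= beta
    )


def v_Q(u, alpha, beta):
    """Value of the relaxation Q(u): minimum of u(Fx+g) over S, by enumeration."""
    return min(
        sum(ui * li for ui, li in zip(u, loads(x)))
        for x in product((0, 1), repeat=8)
        if feasible(x, alpha, beta)
    )


# Multiplier taken from the text; it is not an extreme point of U.
u_star = (Fraction(3, 7), Fraction(4, 7))
```

Evaluating `v_Q(u_star, 15, 11)` gives the bound attached to $u^*$; comparing it with the optimal makespan shows the strict gap $v(Q) < v(P)$ for $(P_{15,11})$.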


(Ax  <  b) 

(Cx < d)

5. Concluding remarks

We have presented various relaxations and decompositions which can be applied to
minmax integer programming problems. As an alternative to subgradient optimization, we have
also extended column generation to solve Lagrangean decomposition duals ([2]). This technique
has been applied to the minimization of excess capacity in loading problems and makespan in
flexible manufacturing systems. All the proofs are contained in ([2]).

INTRODUCTION

The  design  of  any  process  system  for  producing  desired  products  from  available  raw 
materials  almost  always  involves  process  network  synthesis  (PNS).  A  process  system  is  a 
network  of  operating  units,  each  of  which  transforms  a  specified  number  of  input  materials 
with  known  quality  into  a  specified  number  of  output  materials  by  altering  their  physical, 
chemical,  or  biological  properties.  The  importance  of  PNS  arises  from  the  fact  that  essentially 
every  product  of  the  chemical  and  allied  industries  is  manufactured  by  such  a  network; 
moreover,  the  profitability  of  the  same  product  from  different  networks  may  vary  widely. 

The  MINLP  model  of  PNS  contains  a  large  number  of  binary  variables  associated  with 
the  operating  units.  This  renders  the  model  difficult  to  solve  by  any  available  method  without 
exploiting  the  specific  features  of  process  structures  and  the  model.  Although  its  complexity  is 
exponential,  the  branch  and  bound  method  has  the  advantages  of  being  independent  of  an 
initial  structure;  guaranteeing  the  optimality  provided  that  the  bounding  algorithm  exists;  and 
being  capable  of  incorporating  combinatorial  algorithms.  Nevertheless,  the  general  branch  and 
bound  method  is  inefficient  in  solving  the  MINLP  model  of  PNS  because  a  large  number  of 
NLP  subproblems  is  generated  and  the  number  of  free  variables  is  unnecessarily  large  for  each 
subproblem,  i.e.,  many  of  such  free  variables  are  associated  with  operating  units  that  need  be 
excluded  from  any  feasible  solution  of  this  subproblem. 

Combinatorial  analysis  of  the  MINLP  model  of  PNS  and  that  of  feasible  process 
structures  yield  mathematical  tools  for  exploiting  the  unique  characteristics  of  PNS.  These 
tools  can  accelerate  the  branch  and  bound  search  for  the  optimal  solution  by  minimizing  the 
number  of  subproblems  to  be  solved  and  by  reducing  the  size  of  an  individual  subproblem 
through  exclusion  of  the  binary  variables  and  constraints  of  those  operating  units  that  must  not 
be  included  in  any  feasible  solution  of  the  subproblem.  This  algorithm  has  been  validated  on 
the  basis  of  combinatorial  analysis  of  process  structures  and  has  been  applied  for  solving 
industrial instances of PNS.

STRUCTURE REPRESENTATION IN PNS

The simple directed graph is effective in representing structures of general network
problems [1]; however, it is unsuitable for PNS, as demonstrated by simple examples [2].
Structure representation with enhanced sophistication is required for PNS.

Let M be a given set of objects, usually material species or materials that can be
converted  or  transformed  by  the  process  under  consideration.  Transformation  between  two 
subsets  of  M  occurs  in  an  operating  unit.  It  is  necessary  to  link  this  operating  unit  to  other 
operating  units  through  the  elements  of  these  two  subsets  of  M.  The  resultant  structure  can  be 
described  by  a  directed  bipartite  graph,  termed  a  process  graph  or  P-graph  in  short,  which 
alleviates  the  difficulty  encountered  in  representing  a  process  structure  by  a  simple  directed 
graph. 

Definition 1. Let $M$ be a finite set, and let set $O \subseteq \wp(M) \times \wp(M)$ with $M \cap O = \emptyset$,
where $\wp(M)$ denotes the power set of $M$. Pair $(M, O)$ is defined to be a process graph or
P-graph; the set of vertices of this graph is $M \cup O$, and the set of arcs is $A = A_1 \cup A_2$ with
$A_1 = \{(x, Y) \mid Y = (\alpha, \beta) \in O \text{ and } x \in \alpha\}$ and $A_2 = \{(Y, x) \mid Y = (\alpha, \beta) \in O \text{ and } x \in \beta\}$.
P-graph $(M', O')$ is defined to be a subgraph of $(M, O)$, i.e., $(M', O') \subseteq (M, O)$,
if $M' \subseteq M$ and $O' \subseteq O$. Let $(M_1, O_1)$ and $(M_2, O_2)$ be two subgraphs of $(M, O)$.
The union of $(M_1, O_1)$ and $(M_2, O_2)$ is defined by P-graph $(M_1 \cup M_2, O_1 \cup O_2)$,
denoted by $(M_1, O_1) \cup (M_2, O_2)$; obviously, this union is a subgraph of $(M, O)$.
If $(\alpha, \beta)$ is an element of $O$, then set $\alpha$ is the input-set of $(\alpha, \beta)$, while set $\beta$ is its
output-set. The sets of arcs incident into, out of, and to vertex $x$ are denoted by $\omega^-(x)$, $\omega^+(x)$,
and $\omega(x)$, respectively. The indegree, $d^-$, and the outdegree, $d^+$, of vertex $x$ are defined by
$d^-(x) = |\omega^-(x)|$ and $d^+(x) = |\omega^+(x)|$. The degree of vertex $x$ is defined by $d(x) =
d^-(x) + d^+(x)$. Since sets $\omega^-(x)$ and $\omega^+(x)$ do not intersect for a P-graph, we have $d(x) = |\omega(x)|$.
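The arc sets and degrees of Definition 1 can be illustrated on a small P-graph. The sketch below is only an illustration: operating units are stored under hypothetical labels (`u1`, `u2`, `u3`) mapped to their (input-set, output-set) pairs, which assumes the labels are distinct from the material names so that the two vertex classes stay disjoint.

```python
def pgraph_arcs(units):
    """Return (A1, A2) for units given as {label: (input_set, output_set)}."""
    # A1: arcs from each input material to the unit that consumes it.
    a1 = {(x, name) for name, (inputs, _) in units.items() for x in inputs}
    # A2: arcs from each unit to the materials it produces.
    a2 = {(name, x) for name, (_, outputs) in units.items() for x in outputs}
    return a1, a2


def degrees(vertex, arcs):
    """Return (indegree, outdegree) of a vertex in the arc set."""
    indeg = sum(1 for _, head in arcs if head == vertex)
    outdeg = sum(1 for tail, _ in arcs if tail == vertex)
    return indeg, outdeg


# Illustrative example: A can be made from B (u1) or from C (u2); B is made from C (u3).
UNITS = {
    "u1": ({"B"}, {"A"}),
    "u2": ({"C"}, {"A"}),
    "u3": ({"C"}, {"B"}),
}
A1, A2 = pgraph_arcs(UNITS)
ARCS = A1 | A2
```

In this example material A has two incoming arcs and none outgoing, while material C has none incoming, which is the situation axiom (S2) later reserves for raw materials.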

MINLP  MODEL  OF  PNS 

Let us consider a PNS problem in which the set of desired products is denoted by $P$; the
set of raw materials, by $R$; and the set of available operating units, by $O = \{o_1, o_2, \ldots, o_n\}$.
Moreover, let $M$ be the set of materials belonging to these units, and assume that
$P \cap R = \emptyset$, $P \subseteq M$, $R \subseteq M$, and $M \cap O = \emptyset$. Then, P-graph $(M, O)$ contains the
interconnections among units of $O$. Furthermore, each feasible solution of this problem
corresponds to a subgraph of $(M, O)$. For any $1 \le j \le n$, let $y_j = 1$ if $o_j$ is contained in this
subgraph and $y_j = 0$ otherwise. Thus, this subgraph is determined by the vector $(y_1, y_2, \ldots, y_n)$.
Let $A = \{a_1, a_2, \ldots, a_r\}$ be the set of arcs and continuous variable $x_k$ ($k = 1, 2, \ldots, r$)
be assigned to arc $a_k$. The function for which $\varphi(\{a_{j_1}, a_{j_2}, \ldots, a_{j_k}\}) = (x_{j_1}, x_{j_2}, \ldots, x_{j_k})$
holds for any subset $\{a_{j_1}, a_{j_2}, \ldots, a_{j_k}\}$ of $A$ is denoted by $\varphi$. Finally, continuous
variable $z_j$ is assigned to operating unit $o_j$ for $j = 1, 2, \ldots, n$.

The constraints on and the cost of operating unit $o_j$ can be expressed, respectively, by

\[ g_j(y_j, \varphi(\omega(o_j)), z_j) \le 0, \qquad j = 1, 2, \ldots, n \]

\[ f_j(y_j, \varphi(\omega(o_j)), z_j), \qquad j = 1, 2, \ldots, n \]

where for a fixed value of $y_j$, both $f_j$ and $g_j$ are nonlinear, differentiable functions on the
practically interesting domain for $j = 1, 2, \ldots, n$.

Similarly, the constraint on and the cost function of vertex $m_i$ can be given, respectively,
as follows:

\[ g_i'(\varphi(\omega(m_i))) \le 0, \qquad i = 1, 2, \ldots, l \]

\[ f_i'(\varphi(\omega(m_i))), \qquad i = 1, 2, \ldots, l \]

In practice, $g'$ and $f'$ are usually linear. The cost function of the PNS problem is the sum of
the costs of the materials and operating units involved.

COMBINATORIAL  STRUCTURE  OF  PNS 

In general, an arbitrary vector $(y_1, y_2, \ldots, y_n)$ ($y_i \in \{0, 1\}$, $i = 1, 2, \ldots, n$) does not
define a feasible process structure. The feasible process structures have some common
combinatorial properties [2] that have been expressed implicitly in the MINLP model. Since
each feasible process structure must have these combinatorial properties, the set of subgraphs
of $(M, O)$, considered in solving the model, can be reduced to the set of combinatorially
feasible process structures, or solution-structures in short.

Definition 2. Subgraph $(M', O')$ of P-graph $(M, O)$ is defined to be a solution-structure
of PNS given by set $P$ of products and set $R$ of raw materials if

(S1) $P \subseteq M'$, i.e., every final product is represented in P-graph $(M', O')$;

(S2) $\forall x \in M'$, $d^-(x) = 0$ iff $x \in R$, i.e., a vertex from $M'$ has no input if and only if it
represents a raw material;

(S3) $\forall u \in O'$, $\exists$ path $[u, v]$ in $(M', O')$, where $v \in P$, i.e., every vertex from $O'$ has at
least one path leading to a vertex representing a final product; and

(S4) $\forall x \in M'$, $\exists (\alpha, \beta) \in O'$ such that $x \in (\alpha \cup \beta)$, i.e., any vertex from $M'$ must be
an input to or output from at least one vertex from $O'$.

The  set  of  solution-structures  is  denoted  by  S(P,  R,  O);  its  important  properties  are 
expressed  by  the  following  theorem,  lemma,  and  corollaries. 

Theorem  1.  S(P,  R,  O)  is  closed  under  union. 

Lemma. If $(M', O') \in S(P, R, O)$, then $M' = \bigcup_{(\alpha,\beta) \in O'} (\alpha \cup \beta)$.

The direct consequence of this lemma is the following corollary.

Corollary 1. Let $(M', O') \in S(P, R, O)$; then, $(M', O')$ is uniquely determined if set $O'$
is given.

The  maximal  structure,  defined  below,  plays  an  essential  role  in  PNS. 

Definition 3. Let us assume that $S(P, R, O) \ne \emptyset$. The union of all solution-structures
of PNS is defined to be its maximal structure; it will be denoted by $\mu(P, R, O)$, i.e.,

\[ \mu(P, R, O) = \bigcup_{\sigma \in S(P,R,O)} \sigma . \]

Since the set of solution-structures is finite and closed under union, the maximal structure
also is a solution-structure; this leads to the following corollary.

Corollary 2. $\mu(P, R, O) \in S(P, R, O)$.
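Axioms (S1) through (S4), the Lemma and the definition of the maximal structure can be exercised together on a toy instance. The sketch below is a brute-force illustration under stated assumptions, not the polynomial algorithm of [3]: it enumerates every subset of units, so it is exponential in the number of units, and the unit labels and the example data are hypothetical.

```python
from itertools import combinations


def is_solution_structure(units, chosen, products, raw):
    """Check axioms (S1)-(S4) for the structure induced by the units in `chosen`."""
    # By the Lemma, M' is the union of all input and output sets,
    # so (S4) holds by construction.
    materials = set()
    for name in chosen:
        inputs, outputs = units[name]
        materials |= inputs | outputs
    # (S1) every product is present.
    if not products <= materials:
        return False
    # (S2) a material has no producing unit iff it is a raw material.
    produced = set()
    for name in chosen:
        produced |= units[name][1]
    if any((x not in produced) != (x in raw) for x in materials):
        return False
    # (S3) every unit has a path to a product: propagate backwards from P.
    useful, reaching = set(products), set()
    changed = True
    while changed:
        changed = False
        for name in chosen:
            if name not in reaching and units[name][1] & useful:
                reaching.add(name)
                useful |= units[name][0]
                changed = True
    return reaching == set(chosen)


def solution_structures(units, products, raw):
    """Enumerate all solution-structures, each identified by its unit set."""
    names = sorted(units)
    return [
        frozenset(combo)
        for k in range(1, len(names) + 1)
        for combo in combinations(names, k)
        if is_solution_structure(units, combo, products, raw)
    ]


def maximal_structure(units, products, raw):
    """Union of all solution-structures, or None if there are none."""
    structures = solution_structures(units, products, raw)
    return frozenset().union(*structures) if structures else None


# Illustrative data: u4 makes a by-product E that leads to no product.
UNITS = {
    "u1": ({"B"}, {"A"}),
    "u2": ({"C"}, {"A"}),
    "u3": ({"C"}, {"B"}),
    "u4": ({"C"}, {"E"}),
}
PRODUCTS, RAW = {"A"}, {"C"}
```

For this instance the enumeration finds three solution-structures; their union omits `u4`, which illustrates why units outside the maximal structure can be dropped before the MINLP model is built.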

Naturally, the optimal solution need not be concerned with any operating unit not
included in the maximal structure. Since any optimal solution is a solution-structure, the
MINLP model of PNS can be based on the maximal structure. For this reason, let us suppose
that $S(P, R, O) \ne \emptyset$, and also let us denote the maximal structure, $\mu(P, R, O)$, by $(M', O')$.
A polynomial algorithm is available for the generation of the maximal structure [3].

BUILDING  BLOCKS  OF  THE  ACCELERATED  BRANCH  AND  BOUND  METHOD 

Essentially, the branch and bound method yields the optimal solution of a mathematical
programming problem by generating and solving some simplified subproblems. Suppose that
we have three pairwise disjoint sets $I_0$, $I_1$, and $I_f$ and that $I_0 \cup I_1 \cup I_f = \{1, 2, \ldots, n\}$.
These sets define one subproblem of the branch and bound method. In this
subproblem, $I_0$ and $I_1$ are the sets of indices of binary variables whose values are zero and
one, respectively, and $I_f$ is the set of indices for the free variables of this subproblem, i.e., the
value of any of these variables is supposed to be in closed interval [0,1].

Subproblem Generation

The structures of some, or often most, subproblems, defined by $I_0$, $I_1$, and $I_f$, are not
substructures of any solution-structure; these subproblems are said to be structurally infeasible,
as will be delineated later. Only structurally feasible subproblems should be generated.

Definition 4. Let $\mu(P, R, O) = (M', O')$. Then, P-graph $(m^*, o^*)$ is a
subsolution-structure of PNS given by set $P$ of products and set $R$ of raw materials, if

(SS1) for $x \in m^*$, $d^-(x) = 0$, if $x \in R$;

(SS2) $o^* \subseteq O'$;

(SS3) $\forall u \in o^*$, $\exists$ path $[u, v]$ in $(m^*, o^*)$, where $v \in P$;

(SS4) $\forall x \in m^*$, $\exists (\alpha, \beta) \in o^*$ such that $x \in (\alpha \cup \beta)$.

Let $S^*(P, R, O)$ denote the set of subsolution-structures; note that $(\emptyset, \emptyset) \in S^*(P, R, O)$.
If $(m^*, o^*) \in S^*(P, R, O)$, then $(m^*, o^*) \subseteq \mu(P, R, O)$.

Theorem 2. $S(P, R, O) \subseteq S^*(P, R, O)$.

For a given subsolution-structure, $\sigma^* = (m^*, o^*)$ ($\in S^*(P, R, O)$), let us define set $r^*$
such that

\[ r^* = \{\, x \mid x \in (m^* \setminus R \cup P) \text{ and } d^-(x) = 0 \,\} . \]

Theorem 3. Let $\sigma^* \in S^*(P, R, O)$; then, $\sigma^* \in S(P, R, O)$ if and only if $r^* = \emptyset$.

The accelerated branch and bound algorithm is based on algorithm SSG, given in Figure
1. Algorithm SSG generates each solution-structure exactly once and generates
solution-structures only. It does so by determining the decision-mappings of some
subsolution-structures (see the APPENDIX). These subsolution-structures define structurally
feasible subproblems of the MINLP model of PNS. The validity of algorithm SSG has been
proved by resorting to the following theorems.

Theorem 4. Decision-mapping $\delta[m]$ of algorithm SSG is consistent, and it is a
subsolution-structure.

Theorem 5. Algorithm SSG generates all solution-structures whose decision-mappings
are the extensions of $\delta[m]$.

Theorem 6. $\emptyset$ is a decision-mapping of algorithm SSG.

Theorem 7. No decision-mapping is generated more than once by algorithm SSG.

Theorem 8. Decision-mapping $\delta[m]$ of algorithm SSG is a solution-structure if and
only if set p' of algorithm SSG is empty.

Input: P, R, M, o(x) (x ∈ M);
Comment: P ⊆ M, R ⊆ M, O ⊆ ℘(M) × ℘(M), P ∩ R = ∅,
         o(x) = {(α,β) | (α,β) ∈ O & x ∈ β}, o(x) = ∅ ⇔ x ∈ R;
Output: all solution-structures of the PNS problem;
Global variables: R, o(x) (x ∈ M);

begin
  if P = ∅ then stop;
  SSG(P, ∅, ∅);
end

procedure SSG(p, m, δ[m]):
begin
  if p = ∅ then begin write δ[m]; return end
  let x ∈ p;
  C := ℘(o(x)) \ {∅};
  for all c ∈ C do
  begin
    if ∀ y ∈ m, c ∩ (o(y) \ δ(y)) = ∅ & (o(x) \ c) ∩ δ(y) = ∅
    then
    begin
      δ[m ∪ {x}] := δ[m] ∪ {(x, c)};
      SSG((p ∪ (∪_{(α,β) ∈ c} α)) \ (R ∪ m ∪ {x}), m ∪ {x}, δ[m ∪ {x}]);
    end
  end
  return
end


Figure 1. Algorithm SSG
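The recursion of Figure 1 translates almost line by line into a short program. The following is a sketch under stated assumptions rather than the authors' implementation: units are stored under hypothetical labels mapped to (input-set, output-set) pairs, the material $x$ is chosen as the smallest element of $p$ (Figure 1 leaves the choice open), and each result is reported as the set of selected units instead of the decision-mapping itself.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    items = sorted(items)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def ssg(units, products, raw):
    """Enumerate solution-structures following the recursion of algorithm SSG."""
    # producers[x] plays the role of o(x): the units that have x as an output.
    producers = {}
    for name, (_, outputs) in units.items():
        for x in outputs:
            producers.setdefault(x, set()).add(name)
    results = []

    def recurse(p, delta):
        if not p:
            # Nothing left to produce: report the units selected by delta.
            results.append(frozenset().union(*delta.values()))
            return
        x = min(p)  # any element of p may be chosen; min() keeps runs repeatable
        o_x = producers.get(x, set())
        for c in nonempty_subsets(o_x):
            # Consistency test of Figure 1 against every earlier decision.
            consistent = all(
                not (c & (producers[y] - delta[y])) and not ((o_x - c) & delta[y])
                for y in delta
            )
            if not consistent:
                continue
            new_delta = dict(delta)
            new_delta[x] = c
            inputs_of_c = set().union(*(units[u][0] for u in c))
            recurse((p | inputs_of_c) - (raw | set(new_delta)), new_delta)

    if products:
        recurse(set(products), {})
    return results


# Illustrative data: A from B (u1) or from C (u2); B from C (u3); C is raw.
UNITS = {
    "u1": ({"B"}, {"A"}),
    "u2": ({"C"}, {"A"}),
    "u3": ({"C"}, {"B"}),
}
```

On this example `ssg(UNITS, {"A"}, {"C"})` returns three unit sets, each exactly once, which is the behaviour that Theorems 5 and 7 state for the general case.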

Theorem 9. Only one decision-mapping of algorithm SSG may belong to a
solution-structure.


Subproblem Given by a Subsolution-Structure

Let us define a mapping, denoted by ind, that yields the set of indexes for the elements of
a subset of $O$. Moreover, let $\delta[m]$ be a decision-mapping of subsolution-structure $\sigma^*$, and also
let $S'$ be the following set:

\[ S' = \{\, \sigma \mid \sigma \in S(P, R, O) \text{ and the decision-mapping of } \sigma \text{ is an extension of } \delta[m] \,\}. \]

If this set is not empty, then $\delta[m]$ and the subproblem determined by $\delta[m]$ are defined to be
structurally feasible.

Theorem 10. Suppose that $\delta[m]$ is a structurally feasible subsolution-structure and that
structure $\sigma$ is defined by $\sigma = \bigcup_{\sigma' \in S'} \sigma'$. Then, all solution-structures whose decision-mappings
are extensions of $\delta[m]$ are substructures of $\sigma$. Conversely, $\sigma$ is minimal with this property,
i.e., for any structure $\rho$ such that $\sigma \not\subseteq \rho$, there exists solution-structure $\sigma'$ such that $\sigma' \not\subseteq \rho$.

Obviously, decision-mapping $\delta[m]$ can be extended to decision-mapping $\delta[M^*]$, a
decision-mapping of $S'$. Then, sets $I_1$ and $I_f$ of a subproblem defined by $\delta[m]$ are

\[ I_1 = ind\Bigl( \bigcup_{x \in m} \delta(x) \Bigr) \quad \text{and} \quad I_f = ind\Bigl( \bigcup_{x \in M} \delta'(x) \Bigr), \]

where $\delta'[M] = \delta[M^*] \setminus \delta[m]$.


ACCELERATED  BRANCH  AND  BOUND  ALGORITHM 

Based on the building blocks mentioned above, an accelerated branch and bound
algorithm, algorithm ABB, has been developed for solving the MINLP model of a PNS
problem (see Figure 2). This algorithm yields the optimal solution, provided that the bounding
algorithm exists, by generating structurally feasible subproblems only. Moreover, the size of
each subproblem is reduced by excluding the binary variables and constraints of those
operating units that cannot be included in any feasible solution of the subproblem (see Figure
3 for procedure RSG).


Input: P, R, M, o(x) (x ∈ M);
Comment: P ⊆ M, R ⊆ M, O ⊆ ℘(M) × ℘(M), P ∩ R = ∅,
         o(x) = {(α,β) | (α,β) ∈ O & x ∈ β}, o(x) = ∅ ⇔ x ∈ R;
Output: optimal solution of the PNS problem;
Global variables: R, o(x) (x ∈ M), U, currentbest;

begin
  U := ∞; currentbest := anything;
  if P = ∅ then stop;
  ABB(P, ∅, ∅);
  if U < ∞ then print currentbest else print 'there is no solution';
end

procedure ABB(p, m, δ[m]);
begin
  let x ∈ p; C := ℘(o(x)) \ {∅};
  for all c ∈ C do
  begin
    if ∀ y ∈ m, c ∩ (o(y) \ δ(y)) = ∅ & (o(x) \ c) ∩ δ(y) = ∅
    then begin
      δ[m ∪ {x}] := δ[m] ∪ {(x, c)};
      p' := (p ∪ (∪_{(α,β) ∈ c} α)) \ (R ∪ m ∪ {x}); m' := m ∪ {x};
      if p' = ∅
      then begin
        U := min(U, BOUND(m', ∅, δ[m'])); update currentbest;
      end
      else if RSG(p', m', δ[m'], M, δ[m' ∪ M])
        then if U ≥ BOUND(m', M, δ[m' ∪ M]) then ABB(p', m', δ[m']);
    end
  end
  return
end


Figure 2. Algorithm ABB


procedure RSG(p', m', δ[m'], M, δ[m' ∪ M]):
begin
  p := p';
  M := ∅;
  while p is not empty do
  begin
    let x ∈ p;
    M := M ∪ {x};
    δ(x) := o(x) \ (∪_{y ∈ m'} (o(y) \ δ(y)));
    if δ(x) = ∅ then return false;
    p := (p ∪ (∪_{(α,β) ∈ δ(x)} α)) \ (R ∪ m' ∪ M);
  end
  return true
end


Figure 3. Procedure RSG


Example 

The accelerated branch and bound algorithm has generated 6325 subproblems for an
industrial PNS problem involving 35 operating units in the worst case [2]. This is about one
millionth of the number of the subproblems generated by the general branch and bound
algorithm in the worst case. The reduction in the number of free variables of each subproblem
of the accelerated branch and bound algorithm is also essential.

APPENDIX
Decision Mappings

To generate a certain class of subgraphs of a graph, e.g., a set of feasible structures, a
special technique, decision-mapping, has been developed to organize the system of decisions.
Decision-mapping is a special tool to render our decisions consistent and complete in dealing
with complex decision-problems such as PNS.

Let us suppose that for finite sets $M$ and $O$, $O \subseteq \wp(M) \times \wp(M)$ holds; moreover, for
$x \in M$, let us define set $o(x)$ by $o(x) = \{(\alpha, \beta) \mid (\alpha, \beta) \in O \text{ and } x \in \beta\}$; naturally, pair
$(M, O)$ is a P-graph.

Definition 5. Let us suppose that sets $M$ and $O$ satisfy $O \subseteq \wp(M) \times \wp(M)$ and set $m$
be a subset of $M$. Moreover, let $\delta(x)$ be a subset of $o(x)$, for $x \in m$. Then,
$\delta[m] = \{(x, \delta(x)) \mid x \in m\}$ is defined to be a decision-mapping on its domain $m$.

Definition 6. Decision-mapping $\delta_1[m_1]$ is defined to be the restriction of
decision-mapping $\delta_2[m_2]$ to $m_1$ if $m_1 \subseteq m_2$ and $\delta_1[m_1] = \{(x, \delta_2(x)) \mid x \in m_1\}$.

Definition 7. The complement of decision-mapping $\delta[m]$ is defined by
$\bar{\delta}[m] = \{(x, y) \mid x \in m \text{ and } y = o(x) \setminus \delta(x)\}$. Thus, $\bar{\delta}(x) = o(x) \setminus \delta(x)$ for $x \in m$.

Definition 8. Decision-mapping $\delta[m]$ is consistent if $|m| \le 1$ or $(\delta(x) \cap \delta(y)) \cup
(\bar{\delta}(x) \cap \bar{\delta}(y)) = o(x) \cap o(y)$ for any $x, y \in m$.

Theorem 11. Decision-mapping $\delta[m]$ with $|m| \ge 1$ is consistent if and only if $\delta(x) \cap
\bar{\delta}(y) = \emptyset$ for all $x, y \in m$.
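The consistency condition of Definition 8 and the equivalent test of Theorem 11 are easy to compare on a small case. In the sketch below the data layout is an assumption made for illustration: `o` maps each material to the set of (hypothetically labelled) units producing it, and a decision-mapping `delta` maps each decided material to the subset of those units that is selected.

```python
def complement(o, delta):
    """Complement of a decision-mapping: producers of x that were not selected."""
    return {x: o[x] - delta[x] for x in delta}


def consistent_by_definition(o, delta):
    """Definition 8: selected and rejected units must partition the shared producers."""
    comp = complement(o, delta)
    return all(
        (delta[x] & delta[y]) | (comp[x] & comp[y]) == o[x] & o[y]
        for x in delta
        for y in delta
    )


def consistent_by_theorem(o, delta):
    """Theorem 11: no unit is selected for one material and rejected for another."""
    comp = complement(o, delta)
    return all(not (delta[x] & comp[y]) for x in delta for y in delta)


# Illustrative data: unit u2 produces both A and B.
O_OF = {"A": {"u1", "u2"}, "B": {"u2", "u3"}}
GOOD = {"A": {"u1", "u2"}, "B": {"u2"}}  # u2 is selected for both A and B
BAD = {"A": {"u1"}, "B": {"u2"}}         # u2 is rejected for A but selected for B
```

Both functions accept `GOOD` and reject `BAD`; the second form needs only one intersection per pair, which is the test that appears in the branching condition of algorithm SSG.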

Theorem 12. Decision-mapping $\delta_1[m_1]$ is consistent if $\delta_1[m_1] \subseteq \delta_2[m_2]$ and
$\delta_2[m_2]$ is a consistent decision-mapping.

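Definition 9. For consistent decision-mapping $\delta[m]$, let $o = \bigcup_{x \in m} \delta(x)$,
$\hat{m} = \bigcup_{(\alpha,\beta) \in o} (\alpha \cup \beta) \cup m$, and
$\delta'[\hat{m}] = \{(x, y) \mid x \in \hat{m} \text{ and } y = \{(\alpha, \beta) \mid (\alpha, \beta) \in o \text{ and } x \in \beta\}\}$;
then, $\delta'[\hat{m}]$ is defined to be the closure of $\delta[m]$, and $\delta[m]$ is said to be closed if $\delta[m]
= \delta'[\hat{m}]$. The closure of a consistent decision-mapping is closed.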
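The closure of Definition 9 can be computed directly from its three ingredients: the selected units, the enlarged domain, and the induced decisions. The sketch below follows that reading of the definition; the unit labels, the data layout and the function name `closure` are illustrative assumptions.

```python
def closure(units, delta):
    """Closure of a decision-mapping given as {material: set of unit labels}.

    `units` maps each label to its (input_set, output_set) pair.
    """
    # o: all units selected by some decision.
    o = set().union(*delta.values())
    # Enlarged domain: the old domain plus every input and output of a selected unit.
    domain = set(delta)
    for name in o:
        inputs, outputs = units[name]
        domain |= inputs | outputs
    # For each material of the enlarged domain, the selected units producing it.
    return {
        x: frozenset(name for name in o if x in units[name][1])
        for x in domain
    }


# Illustrative data: A from B (u1) or from C (u2); B from C (u3).
UNITS = {
    "u1": ({"B"}, {"A"}),
    "u2": ({"C"}, {"A"}),
    "u3": ({"C"}, {"B"}),
}
DELTA = {"A": {"u1"}}
CLOSED = closure(UNITS, DELTA)
```

Here the closure extends the domain from {A} to {A, B}, leaves the decision on A unchanged, and is itself unchanged by a second application of `closure`.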

Theorem 13. Let $\delta'[\hat{m}]$ be the closure of consistent decision-mapping $\delta[m]$; then,
$\delta(x) = \delta'(x)$ for all $x \in m$, i.e., $\delta[m]$ is the restriction of $\delta'[\hat{m}]$ to $m$.

Corollary 3. If $\delta'[\hat{m}]$ is the closure of consistent decision-mapping $\delta[m]$, then $\delta[m]
\subseteq \delta'[\hat{m}]$.

Theorem  14.  The  closure  of  a  consistent  decision-mapping  is  consistent. 

Definition 10. Two consistent decision-mappings are equivalent if they have common
closure.

Naturally,  a  consistent  decision-mapping  is  equivalent  to  its  closure,  and  the  relation 
"equivalent"  is  an  equivalence  relation. 

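Definition 11. $m'$ is said to be an active domain of decision-mapping $\delta[m]$ if $m' \subseteq m$,

\[ \bigcup_{x \in m'} \delta(x) = \bigcup_{x \in m} \delta(x) \quad \text{and} \quad \bigcup_{x \in m'} \bar{\delta}(x) = \bigcup_{x \in m} \bar{\delta}(x) . \]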

Note  that  m  is  always  an  active  domain  of  decision-mapping  6[m],  and  a 
decision-mapping  can  have  multiple  active  domains. 

Theorem 15. Let $\delta[m]$ be a consistent decision-mapping; then, it is determined on its
whole domain, $m$, if it is given only on one of its active domains.

Theorem  16.  If  a  decision-mapping  is  consistent  on  one  of  its  active  domains,  then, 
it  is  consistent. 

Definition 12. Let $\delta_1[m_1]$ and $\delta_2[m_2]$ be consistent decision-mappings with their
closures, $\delta_1'[\hat{m}_1]$ and $\delta_2'[\hat{m}_2]$, respectively. Then, $\delta_1[m_1]$ is defined to be an extension of
$\delta_2[m_2]$ if

(i) $m_1 \supseteq m_2$, $\hat{m}_1 \supseteq \hat{m}_2$,

(ii) $\delta_2[m_2]$ is the restriction of $\delta_1[m_1]$ to $m_2$, i.e., $\delta_1(x) = \delta_2(x)$ for $x \in m_2$,

(iii) $\delta_1'(x) \supseteq \delta_2'(x)$ for $x \in \hat{m}_2 \setminus m_2$.

A consistent decision-mapping is an extension of $\emptyset$. That $\delta_1[m_1]$ is an extension of
$\delta_2[m_2]$ is denoted by $\delta_1[m_1] \ge \delta_2[m_2]$.

Theorem 17. $\delta'[\hat{m}] \ge \delta[m]$, where $\delta'[\hat{m}]$ is the closure of consistent
decision-mapping $\delta[m]$.

Theorem  18.  The  relation  extension  is  a  partial  order  of  the  set  of  consistent 
decision-mappings. 

Let P-graph $(m, o)$ be a subgraph of P-graph $(M, O)$.

Definition 13. $m'$ is an active set of P-graph $(m, o)$, if $m' \subseteq m$ and $\beta \cap m' \ne \emptyset$ for
any $(\alpha, \beta) \in o$.

Definition 14. Let $m'$ be an active set of P-graph $(m, o)$; then, $\delta[m']$ is defined to be
a decision-mapping of P-graph $(m, o)$ if $\delta[m'] = \{(x, y) \mid x \in m' \text{ and } y = \{(\alpha, \beta) \mid (\alpha, \beta)
\in o \text{ and } x \in \beta\}\}$, i.e., $\delta(x) = \{(\alpha, \beta) \mid (\alpha, \beta) \in o \text{ and } x \in \beta\}$ for $x \in m'$.

Theorem 19. The decision-mappings of a P-graph are consistent.

Theorem 20. Decision-mapping $\delta[m]$ of P-graph $(m, o)$ is closed if set $m$ is active.

Theorem 21. An active set of P-graph $(m, o)$ is an active domain of its
decision-mapping $\delta[m]$ if set $m$ is active.

Theorem 22. The decision-mappings of a P-graph are equivalent provided that this
P-graph has an active set.

Theorem 23. Let $\delta[m']$ be a consistent decision-mapping, $o = \bigcup_{x \in m'} \delta(x)$ and
$m = \bigcup_{(\alpha,\beta) \in o} (\alpha \cup \beta)$. Then, $(m, o)$ is a P-graph, $m'$ is an active set of P-graph $(m, o)$, and $\delta[m']$
is a decision-mapping of P-graph $(m, o)$.

Definition 15. The P-graph of consistent decision-mapping $\delta[m']$ is defined to be $(m, o)$,
where $o = \bigcup_{x \in m'} \delta(x)$ and $m = \bigcup_{(\alpha,\beta) \in o} (\alpha \cup \beta)$.

Theorem 24. An active domain of a consistent decision-mapping is an active set of its
P-graph.

Theorem  25.  Equivalent  consistent  decision-mappings  have  the  same  P-graph. 


On a special class of linear programs with
an additional reverse convex constraint

H-1518 P.O.Box 63, XL Kende u. 11-13. Budapest, Hungary.

Abstract: In the paper, a special class of linear programs with an additional
reverse convex constraint is treated. The problems to be considered have the special
property that the feasible set is the union of some faces of the polyhedron
determined by the linear constraints. Several nonconvex programming problems can
be written into this form, e.g. the minimum linear complementarity problem, the
linear disjunctive programming problem, the linear bilevel programming problem,
the problem of linear optimization over the efficient set, etc. We propose a finite
method based on convexity and disjunctive cuts for solving such problems.


1.  Introduction 


The problems to be considered are given in the form

$\min\ c^T x \quad \text{s.t.}\ x \in P,\ g(x) = 0,$ (1.1)

where $P \subset R^n$ is a nonempty polyhedron, $c$ is an $n$-vector and $^T$ denotes
transposition. In addition, $g : G \to R$ is a concave function such that $G \subset R^n$ is
a convex set, $P \subset G$ and

$g(x) \ge 0$ for every $x \in P$.

Because of the last property, (1.1) is equivalent to

$\min\ c^T x \quad \text{s.t.}\ x \in P,\ g(x) \le 0,$ (1.2)

which is the form of a linear program with an additional reverse convex
constraint. Several methods have been published for solving linear programs with
an additional reverse convex constraint, see e.g. [6,8] and the references therein.
However, instead of applying one of these methods directly, we propose a
modification of the algorithm presented in [6] for solving (1.2). This is motivated by
the property that the feasible set of (1.1), if nonempty, is the union of some
faces of $P$. Consequently, if (1.1) has a finite optimal value, there exists a vertex
of $P$ among the optimal solutions. For a linear program with a general reverse
convex constraint, we can state only that an optimal solution can be attained on
an edge of $P$.

In Section 2, we propose a finite cutting plane method for solving (1.1). In
Sections 3 through 6, we show that a wide range of nonconvex programming problems
is transformable into the form of (1.1).

2.  A  finite  cutting  plane  method 

We assume that the polyhedron $P$ is given by $P = \{x \in R^n \mid Ax = b,\ x \ge 0\}$,
where $A$ is an $m \times n$ matrix and $b$ is an $m$-vector. Let $X$ denote the feasible set
of (1.1).

Proposition 2.1. If $X \ne \emptyset$, then $X$ is the union of some faces of $P$.

Proposition 2.2. If $X \ne \emptyset$, then exactly one of the following cases holds:

(i) Problem (1.1) has a finite optimal value and there exists an extreme point
of $P$ among the optimal solutions;

(ii) The objective function $c^T x$ is unbounded below over $X$ and there exists an
edge $F$ of $P$ such that $F \subset X$ and $c^T x$ is unbounded below over $F$.

Consider first the case when $c^T x$ is bounded below over $P$, e.g. $P$ is bounded.
Let $V(P)$ denote the set of the extreme points of $P$. Consider the problem

$\min\ c^T x \quad \text{s.t.}\ x \in V(P),\ g(x) = 0.$ (2.1)

By Proposition 2.2, problems (1.1) and (2.1) are feasible simultaneously,
and any optimal solution of (2.1) is optimal for (1.1) as well. If we are interested
only in finding the optimal value and an optimal solution of (1.1), it is enough to
solve (2.1). We shall deal with (2.1) in the sequel.
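
To make problem (2.1) concrete, the following sketch solves it by brute-force enumeration of the basic feasible solutions on a tiny instance. This is not the cutting plane method of this section; it is only an illustration under stated assumptions: exact rational arithmetic, a hypothetical instance (the unit square written with slack variables and $g(x) = \min\{x_1, x_2\}$), and hypothetical helper names `solve_square`, `vertices` and `solve_2_1`. The enumeration is exponential in general.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(M, rhs):
    """Solve M y = rhs exactly by Gauss-Jordan elimination; None if M is singular."""
    n = len(M)
    T = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next((r for r in range(col, n) if T[r][col] != 0), None)
        if piv is None:
            return None
        T[col], T[piv] = T[piv], T[col]
        T[col] = [v / T[col][col] for v in T[col]]
        for r in range(n):
            if r != col and T[r][col] != 0:
                f = T[r][col]
                T[r] = [a - f * p for a, p in zip(T[r], T[col])]
    return [T[r][n] for r in range(n)]


def vertices(A, b):
    """Yield the basic feasible solutions of Ax = b, x >= 0, i.e. the set V(P)."""
    m, n = len(A), len(A[0])
    seen = set()
    for basis in combinations(range(n), m):
        B = [[A[i][j] for j in basis] for i in range(m)]
        xb = solve_square(B, b)
        if xb is None or any(v < 0 for v in xb):
            continue
        x = [Fraction(0)] * n
        for j, v in zip(basis, xb):
            x[j] = v
        if tuple(x) not in seen:  # a degenerate vertex belongs to several bases
            seen.add(tuple(x))
            yield tuple(x)


def solve_2_1(A, b, c, g):
    """Return (optimal value, optimal vertex) of (2.1), or None if it is infeasible."""
    objective = lambda x: sum(ci * xi for ci, xi in zip(c, x))
    feasible = [x for x in vertices(A, b) if g(x) == 0]
    if not feasible:
        return None
    best = min(feasible, key=objective)
    return objective(best), best


# Hypothetical instance: x1 + x3 = 1, x2 + x4 = 1, x >= 0, g(x) = min{x1, x2}.
A = [[1, 0, 1, 0], [0, 1, 0, 1]]
b = [1, 1]
c = [-2, -1, 0, 0]
g = lambda x: min(x[0], x[1])
value, vertex = solve_2_1(A, b, c, g)
print(value, [int(v) for v in vertex])
```

In this instance the linear program without the constraint $g(x) = 0$ would be minimized at the vertex (1, 1, 0, 0), which violates $g(x) = 0$; the enumeration instead returns a vertex on the union of faces where $x_1 x_2 = 0$.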

Any $x^0 \in V(P)$ is also a basic feasible solution of

$Ax = b,\ x \ge 0.$ (2.2)

For a feasible basis $B$ of (2.2), let the simplex tabular form of (2.2) be determined
by

$x_i + \sum_{j \in \bar I_B} a_{ij} x_j = a_{i0}, \quad i \in I_B,$ (2.3)

where $I_B$ and $\bar I_B$ denote the index sets of the basic and nonbasic variables,
respectively.

Consider an $x^0 \in V(P)$ such that $g(x^0) > 0$. Then, $x^0$ is not a feasible solution
of (1.1). We construct a convexity cut to exclude $x^0$ from the further search. Let
$B$ be a feasible basis belonging to $x^0$. Using the coefficients of the simplex tabular
form (2.3) determined by $B$, construct the vectors $z^j \in R^n$, $j \in \bar I_B$, defined by

$z^j_k = \begin{cases} 1 & \text{for } k = j, \\ -a_{kj} & \text{for } k \in I_B, \\ 0 & \text{otherwise,} \end{cases} \quad k = 1,\ldots,n;\ j \in \bar I_B.$ (2.4)

For every $j \in \bar I_B$ compute

$\lambda_j = \sup \{\lambda \mid x^0 + \lambda z^j \in G,\ g(x^0 + \lambda z^j) \ge 0\}.$ (2.5)

Clearly, $\lambda_j \ge 0$ for every $j \in \bar I_B$.

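Proposition 2.3. Assume that $\lambda_j > 0$ for every $j \in \bar I_B$. Define $t \in R^n$ by

$t_j = \begin{cases} 1/\lambda_j & \text{for } j \in \bar I_B \text{ and } \lambda_j < \infty, \\ 0 & \text{otherwise,} \end{cases} \quad j = 1,\ldots,n.$ (2.6)

Then

$t^T x^0 < 1$ and $g(x) > 0$ for every $x \in P \cap \{x \in R^n \mid t^T x < 1\}.$ (2.7)

Proposition 2.4. If the vertex $x^0$ is nondegenerate, then $\lambda_j > 0$ for every $j \in \bar I_B$.

Proposition 2.5. If $x^0$ is an inner point of $G$, then $\lambda_j > 0$ for every $j \in \bar I_B$.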

Assume that we have an $x^0 \in V(P)$ such that $g(x^0) > 0$ and $\lambda_j > 0$ for every
$j \in \bar I_B$. Then, the convexity cut

$t^T x \ge 1$ (2.8)

defined by (2.6) cuts off $x^0$ but leaves the possible points of $X$. If we have $\lambda_j = 0$
for a $j \in \bar I_B$, then a convexity cut similar to (2.8) can also be generated at the
expense of some extra effort, including the determination of the edges emanating
from $x^0$ and solving a linear program [8].
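
The construction of the convexity cut from (2.4)-(2.6) can be sketched as follows. Several points are assumptions of this sketch and not part of the method as stated: the supremum in (2.5) is approximated by bisection up to a hypothetical cap `cap`, $G$ is taken to be all of $R^n$, the tableau entries are passed in as a dictionary `a[(i, j)]`, and all function names are hypothetical. The example reuses the unit square with slacks and $g(x) = \min\{x_1, x_2\}$ at the infeasible vertex $x^0 = (1, 1, 0, 0)$.

```python
import math


def edge_directions(n, basic, nonbasic, a):
    """Build the vectors z^j of (2.4); a[(i, j)] is the tableau entry a_ij."""
    z = {}
    for j in nonbasic:
        zj = [0.0] * n
        zj[j] = 1.0
        for i in basic:
            zj[i] = -a[(i, j)]
        z[j] = zj
    return z


def step_length(g, x0, zj, cap=1e6, tol=1e-9):
    """Approximate lambda_j of (2.5) by bisection; math.inf if g >= 0 up to cap."""
    point = lambda lam: [xi + lam * zi for xi, zi in zip(x0, zj)]
    if g(point(cap)) >= 0:
        return math.inf
    lo, hi = 0.0, cap  # invariant: g >= 0 at lo, g < 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(point(mid)) >= 0:
            lo = mid
        else:
            hi = mid
    return lo


def convexity_cut(g, x0, basic, nonbasic, a, tol=1e-9):
    """Return the vector t of (2.6), or None when some lambda_j is (numerically) zero."""
    n = len(x0)
    z = edge_directions(n, basic, nonbasic, a)
    t = [0.0] * n
    for j in nonbasic:
        lam = step_length(g, x0, z[j])
        if lam <= tol:
            return None
        if lam < math.inf:
            t[j] = 1.0 / lam
    return t


# Tableau at x0 = (1, 1, 0, 0): x1 + x3 = 1, x2 + x4 = 1 (0-based indices below).
a = {(0, 2): 1.0, (0, 3): 0.0, (1, 2): 0.0, (1, 3): 1.0}
g = lambda x: min(x[0], x[1])
x0 = [1.0, 1.0, 0.0, 0.0]
t = convexity_cut(g, x0, basic=[0, 1], nonbasic=[2, 3], a=a)
print([round(v, 6) for v in t])
```

Here both step lengths are close to 1, so the cut reads approximately $x_3 + x_4 \ge 1$, which removes $x^0$ and keeps the three vertices with $g = 0$.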

In the latter case, an alternative and faster way of excluding $x^0$ from the further
search is the generation of a disjunctive cut. We construct a cut of form (2.8) such
that

$t^T x^0 < 1$ and $t^T x \ge 1$ for every $x \in V(P) \setminus \{x^0\}.$ (2.9)

Let $I_+ = \{i \mid x^0_i > 0\}$. Then for any $x \in V(P) \setminus \{x^0\}$, there exists at least
one $i \in I_+$ such that $x_i = 0$. The disjunctive cut is constructed based upon this
disjunction.

Proposition 2.6. Consider a simplex tabular form (2.3) belonging to $x^0$ and let

$t_j = \begin{cases} \max\{a_{ij}/a_{i0} \mid i \in I_+\} & \text{for } j \in \bar I_B, \\ 0 & \text{otherwise,} \end{cases} \quad j = 1,\ldots,n.$ (2.10)

Then the cut (2.8) defined by (2.10) fulfils (2.9).
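
A minimal sketch of formula (2.10) is given below. It assumes that the tableau entries are supplied as dictionaries `a[(i, j)]` and `a0[i]` (hypothetical names), uses exact rational arithmetic, and rejects the case of an empty $I_+$, which the text does not discuss. The data are again the hypothetical unit-square instance at $x^0 = (1, 1, 0, 0)$.

```python
from fractions import Fraction


def disjunctive_cut(n, basic, nonbasic, a, a0):
    """Return t of (2.10): t_j = max{a_ij / a_i0 : i in I_+} for nonbasic j, else 0."""
    # I_+ consists of the basic variables that are positive at x0
    i_plus = [i for i in basic if a0[i] > 0]
    if not i_plus:
        raise ValueError("I_+ is empty; formula (2.10) is not applicable")
    t = [Fraction(0)] * n
    for j in nonbasic:
        t[j] = max(Fraction(a[(i, j)]) / Fraction(a0[i]) for i in i_plus)
    return t


# Tableau at x0 = (1, 1, 0, 0): x1 + x3 = 1, x2 + x4 = 1 (0-based indices).
a = {(0, 2): 1, (0, 3): 0, (1, 2): 0, (1, 3): 1}
a0 = {0: 1, 1: 1}
t = disjunctive_cut(4, basic=[0, 1], nonbasic=[2, 3], a=a, a0=a0)
print([int(v) for v in t])
```

For each $i \in I_+$ the condition $x_i = 0$ is equivalent to $\sum_j (a_{ij}/a_{i0}) x_j = 1$, so taking the componentwise maximum over $i \in I_+$ gives an inequality implied by every term of the disjunction.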

Assume now that we have found an $x^0 \in V(P)$ such that $g(x^0) = 0$. Then $x^0$ is
feasible to (1.1) and (2.1). Let $N(x^0)$ denote the set of those vertices of $P$ which
are adjacent to $x^0$. Examine whether there exists an $x^1 \in V(P)$ such that

$x^1 \in N(x^0),\ g(x^1) = 0$ and $c^T x^1 < c^T x^0.$ (2.11)

If we find such an $x^1$, then we replace $x^0$ by $x^1$ and repeat the procedure above. In
this way, we move through feasible solutions of (2.1) while improving the objective
function value.

After a finite number of improving steps, we obtain an $x^0$ feasible to (2.1) such
that we cannot find an $x^1 \in V(P)$ fulfilling (2.11). It may also occur that there
exists such an $x^1$ but $x^0$ is a degenerate vertex and we would like to spare the time
needed for determining $N(x^0)$. We add the objective function cut

$c^T x \le \gamma,$ (2.12)

where $\gamma = c^T x^0$, in order to exclude the points with objective function value
greater than $\gamma$. However, since $x^0$ fulfils (2.12), we also generate a disjunctive cut
presented above to exclude $x^0$ from the further search.

After adding one or two of the cuts presented above, we proceed with a new
$x^0 \in V(P)$, if any, such that $x^0$ fulfils the cut constraints generated earlier. At
a step of the algorithm, let $Q \subset R^n$ be the set of the points feasible to the cuts.
Of course, $Q$ is a polyhedron. The subproblem to be solved is now to find a
point of $Q \cap V(P)$ or to prove $Q \cap V(P) = \emptyset$. This is a well-known problem of
nonconvex programming. It was treated first by Majthay and Whinston [10] in a
concave minimization context. They proposed a finite cutting plane method using
a parametric programming technique. Their method was improved and extended
by Fulop [5] using a disjunctive programming technique. We suggest applying the
finite method of [5] to find a point of $Q \cap V(P)$, if any.

The  cutting  plane  method  proposed  for  solving  (2.1)  is  summarized  below. 

Algorithm  2.1: 

Step 0: Set $\gamma \leftarrow \infty$ and $Q \leftarrow R^n$.

Step 1: If $Q \cap V(P) = \emptyset$, stop. Otherwise, find an $x^0 \in Q \cap V(P)$. If $g(x^0) = 0$,
go to Step 2. Otherwise, go to Step 3.

Step 2: If we can find an $x^1$ fulfilling (2.11), then set $x^0 \leftarrow x^1$ and repeat Step
2. Otherwise, set $\gamma \leftarrow c^T x^0$, $x^* \leftarrow x^0$, generate the disjunctive cut (2.8)
defined by (2.10) and set $Q \leftarrow Q \cap \{x \in R^n \mid t^T x \ge 1,\ c^T x \le \gamma\}$. Go to
Step 1.

Step 3: For every $j \in \bar I_B$ determine $\lambda_j$ by (2.5). If $\lambda_j > 0$ for every $j \in \bar I_B$,
generate the convexity cut (2.8) defined by (2.6). Otherwise, generate the
disjunctive cut (2.8) defined by (2.10). Set $Q \leftarrow Q \cap \{x \in R^n \mid t^T x \ge 1\}$
and go to Step 1.

Proposition 2.7. Algorithm 2.1 solves (2.1) in a finite number of steps. If $\gamma = \infty$, then (2.1)
has no feasible solution. Otherwise, $\gamma$ is the optimal value and $x^*$ is an optimal
solution of (2.1).

We turn now to the case when $c^T x$ is unbounded below over $P$. We have to
check whether $c^T x$ is unbounded below over $X$ as well. If (2.1) has no feasible
solution, we are done since $X = \emptyset$. Otherwise, let $\gamma$ be the optimal value of (2.1)
and choose a $\bar\gamma < \gamma$ arbitrarily. Let $\bar P = P \cap \{x \in R^n \mid c^T x = \bar\gamma\}$.

Proposition 2.8. The objective function $c^T x$ is unbounded below over $X$ if and
only if

$\min \{g(x) \mid x \in \bar P\} = 0.$ (2.13)

It is clear that (2.13) holds if and only if there exists a vertex $x$ of $\bar P$ such that
$g(x) = 0$. Similarly to Algorithm 2.1, a finite algorithm based on convexity and
disjunctive cuts can be proposed to verify (2.13).

3.  The  minimum  linear  complementarity  problem 

Consider the minimum linear complementarity problem given in the form

$\min\ c^T x \quad \text{s.t.}\ Ax = b,\ x \ge 0,\ x_i x_{\bar n + i} = 0$ for $i = 1,\ldots,\bar n,$ (3.1)

where the sizes of matrix $A$ and vectors $x$, $b$ and $c$ are the same as in the previous
sections. We assume that $2\bar n \le n$. Judice and Mitra [9] showed that several
well-known mathematical programming problems can be transformed into (3.1). See
[9] for a list of such problems and the details of the reformulations.

For an $x \in R^n$, let

$g(x) = \sum_{i=1}^{\bar n} \min\{x_i, x_{\bar n + i}\}.$

Then $g : R^n \to R$ is a concave function. Let $P = \{x \in R^n \mid Ax = b,\ x \ge 0\}$. It is
easy to see that (3.1) is equivalent to (1.1).

We mention that the reformulation of some mathematical programming problems
into (3.1) may result in unrestricted variables $x_j$, $j = 2\bar n + 1, \ldots, n$. The method
presented in Section 2 can be easily modified for such problems. Another way is
to write these unrestricted variables as differences of nonnegative variables.

4.  The  linear  disjunctive  programming  problem 

A mathematical programming problem is called a linear disjunctive programming
problem if the feasible set can be represented by a finite number of intersection
and union operations on a finite number of closed halfspaces. In addition, a linear
function is to be optimized over the feasible set. It can be shown [7] that any linear
disjunctive program can be transcribed into an equivalent problem of the following
form:

$\min\ c^T x \quad \text{s.t.}\ Ax = b,\ x \ge 0,\ \prod_{j \in I_i} x_j = 0$ for $i = 1,\ldots,k,$ (4.1)

where $I_i \subset \{1,\ldots,n\}$ for $i = 1,\ldots,k$. Let the concave function $g : R^n \to R$ be
defined by

$g(x) = \sum_{i=1}^{k} \min\{x_j \mid j \in I_i\}.$

Then (4.1) is equivalent to (1.1). It is also easy to see that (3.1) is a special case
of (4.1) with $k = \bar n$ and $I_i = \{i, \bar n + i\}$ for $i = 1,\ldots,k$.
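
The function $g$ used in Sections 3 and 4 is easy to evaluate directly. The sketch below, with hypothetical names `g_disjunctive` and `complementarity_sets` and 0-based indices, shows that $g$ vanishes at a nonnegative point exactly when every index set contains a zero component, and that the zero set of $g$ is not convex.

```python
def g_disjunctive(x, index_sets):
    """g(x) = sum over i of min{x_j : j in I_i}, as defined for problem (4.1)."""
    return sum(min(x[j] for j in I) for I in index_sets)


def complementarity_sets(n_bar):
    """Index sets I_i = {i, n_bar + i} (0-based) that specialize (4.1) to (3.1)."""
    return [(i, n_bar + i) for i in range(n_bar)]


sets = complementarity_sets(2)
x = (2, 0, 0, 0)              # complementary: g(x) = 0
y = (0, 0, 2, 0)              # complementary: g(y) = 0
mid = (1, 0, 1, 0)            # midpoint of x and y violates x_1 * x_3 = 0
print(g_disjunctive(x, sets), g_disjunctive(y, sets), g_disjunctive(mid, sets))
```

The midpoint has a strictly positive value of $g$, in line with concavity, and is therefore infeasible although both endpoints are feasible.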

5.  The  linear  bilevel  programming  problem 

Consider the linear bilevel programming problem [1,4] stated as follows:

$\max_y\ c^{(1)T} y + c^{(2)T} z$, where $z$ solves (5.1)

$\max_z\ d^{(1)T} y + d^{(2)T} z$ (5.2)

s.t. $A^{(1)} y + A^{(2)} z = b,$ (5.3)

$y \ge 0,\ z \ge 0,$ (5.4)

where $c^{(1)}$, $d^{(1)}$ and $y$ are $n_1$-vectors, $c^{(2)}$, $d^{(2)}$ and $z$ are $n_2$-vectors, $b$ is an
$m$-vector, $A^{(1)}$ is an $m \times n_1$ matrix and $A^{(2)}$ is an $m \times n_2$ matrix. Let

$x = \begin{pmatrix} y \\ z \end{pmatrix},\quad c = \begin{pmatrix} c^{(1)} \\ c^{(2)} \end{pmatrix},\quad d = \begin{pmatrix} d^{(1)} \\ d^{(2)} \end{pmatrix}$ and $A = (A^{(1)}, A^{(2)})$.

Let $P = \{x \in R^{n_1+n_2} \mid Ax = b,\ x \ge 0\}$. We assume that $P \ne \emptyset$ and $d^T x$ is
bounded above over $P$. For an $x \in R^{n_1+n_2}$, let

$g(x) = -d^{(2)T} z + \max \{d^{(2)T} \bar z \mid A^{(2)} \bar z = b - A^{(1)} y,\ \bar z \ge 0\},$ (5.5)

where $g(x) = -\infty$ if the maximization problem in (5.5) has no feasible solution.
Let $G = \{x \in R^{n_1+n_2} \mid g(x) > -\infty\}$. It can be shown that $G$ is convex, $P \subset G$,
$g$ is a piecewise linear, continuous and concave function over $G$ and $g(x) \ge 0$ for
every $x \in P$. In addition, (5.1)-(5.4) is equivalent to (1.1) with the modification
that $c^T x$ is to be maximized now.

6.  Linear  optimization  over  the  efficient  set 

Consider the multiple objective linear program

'max' $Cx$ s.t. $x \in P$, (6.1)

where $C$ is a $k \times n$ matrix and $P \subset R^n$ is a nonempty polyhedron. By definition [11],
a point $x^0 \in P$ is an efficient solution of (6.1) if and only if there exists no $x \in P$
such that $Cx \ge Cx^0$ and $Cx \ne Cx^0$. Let $E(P)$ denote the set of the efficient
solutions. Consider the problem

$\min\ c^T x \quad \text{s.t.}\ x \in E(P),$ (6.2)

where $c$ is an $n$-vector. Problem (6.2) has several applications in multiple objective
programming, see [2,3,11] and the references therein.
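
The dominance test in the definition of an efficient solution can be illustrated on a finite list of candidate points. This is a simplification assumed only for the sketch: efficiency in (6.1) is defined relative to the whole polyhedron $P$, and a point that is undominated within a finite list may still be dominated by some other point of $P$. The names `dominates` and `efficient_points` and the data are hypothetical.

```python
def dominates(u, v):
    """True if u >= v componentwise and u differs from v."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def efficient_points(points, C):
    """Points of the finite list whose image Cx is not dominated by another image."""
    image = {p: tuple(sum(cij * xj for cij, xj in zip(row, p)) for row in C)
             for p in points}
    return [p for p in points
            if not any(dominates(image[q], image[p]) for q in points)]


# Hypothetical candidates: the four corners of the unit square, two objectives.
points = [(0, 0), (1, 0), (0, 1), (1, 1)]
C = [[1, 0], [-1, 1]]
E = efficient_points(points, C)

# Linear optimization over the undominated candidates, in the spirit of (6.2).
c = (1, 0)
best = min(E, key=lambda p: sum(ci * xi for ci, xi in zip(c, p)))
print(E, best)
```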

For an $x \in R^n$, let $g(x)$ be defined by

$g(x) = \max\ e^T C(y - x) \quad \text{s.t.}\ Cy \ge Cx,\ y \in P,$ (6.3)

where $g(x) = -\infty$ if (6.3) has no feasible solution and $g(x) = \infty$ if the objective
function is unbounded above over the nonempty feasible set of (6.3). In (6.3), $e$ is
the vector whose every component is equal to 1.

Let $G = \{x \in R^n \mid g(x) > -\infty\}$. Clearly $P \subset G$. It can be shown that
$E(P) \ne \emptyset$ if and only if $g(x)$ is finite for every $x \in G$. Assume that $E(P) \ne \emptyset$.
Then $g$ is a nonnegative, piecewise linear, continuous and concave function over
$G$. In addition, problem (6.2) is equivalent to (1.1).
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EXTENDED  ABSTRACT 

We  consider  here  the  following  stochastic  programming  problem: 

minimize $E_\omega f(x,\omega)$ (1)

subject to $p(x) \le 0$, $x \in X$

and, more specifically, the stochastic linear program with recourse
([WETS66], [BIRG86], [KALL76], [PREK73]), which is problem (1) with
$f(x,\omega) = c^T(\omega)x + Q(x,\omega)$, $p(x) = Ax - b$ and

$Q(x,\omega) = \min \{q^T(\omega)y \mid W(\omega)y = h(\omega) - T(\omega)x\}$ (2)

where $E_\omega$ denotes expectation with respect to $\omega$, an element of some
probability space $(\Omega, B, P)$. We assume complete recourse, i.e. (2) always
has a solution.

Several methods for solving this problem which combine Dantzig-Wolfe
decomposition and statistical techniques were proposed recently
([HIGL91], [GAIV89]). The common feature of these methods is the
necessity to solve, at each iteration, a linear or quadratic programming
problem which can be of considerable dimension.

In this paper we continue research in the direction of [GAIV89] and
propose a specific algorithm for the solution of problem (1)-(2) which combines
stochastic quasigradient techniques with generalized linear programming.

Let us explain in more detail the algorithm we use to solve problem
(1)-(2). The generalized programming technique of Wolfe [DANT80]
applied to (1) involves grid linearization of $Q(x) = E_\omega Q(x,\omega)$, which has been
shown to be convex [WETS66], and requires coordinated solution of a
master program and a Lagrangian subproblem defined as follows:


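Master:

$\min\ \sum_{j=1}^{k} c x^j \lambda_j + \sum_{j=1}^{k} Q(x^j) \lambda_j$

s.t. $\sum_{j=1}^{k} (A x^j) \lambda_j = b$

$\sum_{j=1}^{k} \lambda_j = 1$ (3a)

$\lambda_j \ge 0$

where $\pi^k$ are the dual multipliers associated with the optimal solution
of (3a) and $c = E_\omega c(\omega)$.

Subproblem: Find $x^{k+1} \in R^n$ such that $l \le x^{k+1} \le u$ and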

+  (3b) 

by partially optimizing the problem:

$\min_{l \le x \le u}\ \sigma^k x + Q(x)$ (3c)

where $\sigma^k = (c - A^T \pi^k)$.

Hence, the essential feature of the method consists of sending the prices
of the master to the subproblem, which uses them to identify an improved
solution depending on the previous points $x^1, \ldots, x^k$. In this way, a
sequence of points $x^1, \ldots, x^k$ is generated by the algorithm, which converges to
the solution of problem (1) in a certain probabilistic sense.

The problem is that it is not possible to compute the values of $Q(x^j)$ and
its subgradients exactly, except in some rare cases [NAZA86]. More generally,
$Q(x^j)$ can be approximated, for example by a sampling procedure. What we
do is to replace $Q(x^j)$ by an estimate, say $Q_k^j$. This substitution does not affect
the solution of the Master problem too much, and for the Subproblem it is
not relevant if we do not optimize at each iteration but just look for a
point that satisfies relation (3b). This suggests using statistical techniques
for the subgradient estimation in order to find the next improving point.
In particular, we use stochastic quasi-gradient procedures because of their
effectiveness in solving problems that do not have to be pushed all the way to
optimality ([ERMO76], [GAIV88]).

The fundamental steps of the proposed algorithm are illustrated below.

STEP  1:  (Initialize) 

Choose a set of $m$ grid points $x^1, \ldots, x^m$ so that constraints

$\sum_{j=1}^{m} (A x^j) \lambda_j = b$

$\sum_{j=1}^{m} \lambda_j = 1$ (4)

$\lambda_j \ge 0$

have a feasible solution. The simplest way to find them is to fix the values
of [...] and start to solve the problem of minimization of the objective function
$\sum_{j=1}^{k} c x^j \lambda_j$, for example using the simplex algorithm. The basic solutions found
at the first $m$ iterations may be a good choice for initial points.

Set $k \leftarrow m$.

STEP  2:  (form  estimates) 

Define a subset $N_k$ of integers, $N_k \subset \{1, \ldots, k\}$, which are the indices of the
set of grid points for which the estimates will be made, and the number $s(k)$
which controls the precision of the estimates $Q_k^j$ of $Q(x^j)$, in the following way:

- define a sequence $\{k_p\}$, with $k_{p+1} > k_p$, and

IF $k = k_p$ (for some $k_p$) THEN $N_k = \{1, \ldots, k\}$
$s(k) = s(k-1) + 1$
ELSE $N_k = \{k\}$
$s(k) = s(k-1)$

Elements of $\{k_p\}$ define the iterations in which estimates have to be updated
at all of the available grid points. During all other iterations only the
estimation at the last point is performed.

- $Q_k^j$, with $j \in N_k$, is updated by:

IF $k = m$ THEN ($\forall j \le k$)

$Q_k^j = \frac{1}{s_0} \sum_{i=1}^{s_0} Q(x^j, \omega^i)$, where $\omega^i$ are independent observations of $\omega$.

s(k)  .  . 

^  = 
forjVk. 

- The estimate at the last grid point to enter in the set is made as:

/  1  \  , 

Such estimates have the property that $|Q(x^j) - Q_k^j| = \varepsilon_k \to 0$ a.s., as
$s(k) \to \infty$.
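
The bookkeeping of Step 2 can be sketched as follows. The function `estimation_plan` follows the IF/ELSE rule for $N_k$ and $s(k)$ stated above. The function `sample_mean` is a plain Monte Carlo average and is only an assumption of this sketch; it is not claimed to be the update rule of the paper. The recourse function $Q(x, \omega) = |\omega - x|$ with $\omega$ uniform on $[0, 1]$ is a hypothetical example whose mean at $x = 0.5$ equals $0.25$.

```python
import random


def estimation_plan(k, s_prev, refresh_iterations):
    """Return (N_k, s(k)) following the IF/ELSE rule of Step 2."""
    if k in refresh_iterations:  # k = k_p for some p: refresh all grid points
        return list(range(1, k + 1)), s_prev + 1
    return [k], s_prev           # otherwise only the newest grid point


def sample_mean(Q, x, draw, size):
    """Monte Carlo estimate of E Q(x, omega) from `size` independent draws."""
    return sum(Q(x, draw()) for _ in range(size)) / size


rng = random.Random(0)
estimate = sample_mean(lambda x, w: abs(w - x), 0.5, rng.random, 20000)
print(estimation_plan(6, 3, {3, 6, 12}), estimation_plan(7, 4, {3, 6, 12}))
print(round(estimate, 2))
```

Refreshing all grid points only at the iterations $k_p$ keeps the cost per iteration low while still letting the precision parameter $s(k)$ grow.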

STEP  3:  (solve  Master) 

- Solve the master problem (3a) in order to obtain the values of the
dual multipliers $\pi^k$ and of the primal variables $\lambda_j^k$.

At this point the set $A_k = \{j : \lambda_j^k > 0\}$ can be defined, so that it is possible
to redefine the set $N_k$ and avoid updating the estimates at those points having
$j \notin A_k$.

STEP 4: (Define a new grid point)

- Define $\sigma^k = (c - A^T \pi^k)$ and consider the Lagrangian subproblem (3c).

- Fix the number $S_k$ of iterations and compute a sequence of points
using the stochastic quasi-gradient method as follows. Here $d_s$ are
the optimal dual variables associated with the solution of problem
(2) for $x = x^s$.

$\xi^s = \sigma^k - T^T(\omega^s) d_s$
$x^{s+1} = P_X(x^s - \rho_s \xi^s)$ with
$s = 0, \ldots, S_k - 1$, $x^0 = x^k$ and [...]

where $P_X$ is the projection operator over the set $X$ of points belonging to
the feasible region.
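
A projected stochastic quasi-gradient iteration of this kind can be sketched on a one-dimensional simple-recourse example. Everything specific in the sketch is an assumption: the recourse problem is $Q(x, h) = \min\{q^+ y_1 + q^- y_2 \mid y_1 - y_2 = h - x,\ y \ge 0\}$ with $T = 1$ and $h$ uniform on $[0, 1]$, the set $X$ is the box $[l, u]$, the step sizes are $\rho_s = 1/(s+1)$, and all function names are hypothetical.

```python
import random


def project_box(x, lower, upper):
    """Projection P_X onto the box X = {x : lower <= x <= upper} (scalar case)."""
    return min(max(x, lower), upper)


def recourse_dual(x, h, q_plus, q_minus):
    """Optimal dual d of min{q_plus*y1 + q_minus*y2 : y1 - y2 = h - x, y >= 0}."""
    return q_plus if h > x else -q_minus


def sqg(sigma, x_start, steps, lower, upper, rng, q_plus=1.0, q_minus=1.0):
    """Run projected stochastic quasi-gradient steps on sigma*x + E Q(x, h)."""
    x = x_start
    for s in range(steps):
        h = rng.random()                      # one observation of the random data
        d = recourse_dual(x, h, q_plus, q_minus)
        xi = sigma - d                        # T = 1 here, so xi = sigma - T^T d
        rho = 1.0 / (s + 1)                   # diminishing step size (assumed)
        x = project_box(x - rho * xi, lower, upper)
    return x


x_final = sqg(sigma=0.0, x_start=0.0, steps=5000, lower=0.0, upper=1.0,
              rng=random.Random(1))
print(round(x_final, 2))
```

With $\sigma = 0$ and $q^+ = q^-$ the minimizer of the expected recourse cost is the median of $h$, so the iterates should settle near $0.5$ without the subproblem ever being solved exactly.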

STEP 5: (Iterate)

$k \leftarrow k + 1$. Return to Step 2.

It is important to note that it has been possible to apply generalized
linear programming to the stochastic problem by introducing two important
modifications, mentioned above, to the original decomposition method.
Firstly, it does not require exact values of the objective function (step 2). It
is only necessary to have estimates of the objective values at the grid points
with their precision gradually increasing. Secondly, it is not necessary to
minimize the Lagrangian subproblem at step 4 precisely; it is only necessary
that the current point $x$ regularly comes to the vicinity of such a solution.

The convergence of this algorithm to the set of optimal solutions of
problem (1)-(2) with probability 1 follows from results contained in
[GAIV89].
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Methods  using  the  decomposition  techniques  have  as  common  feature 
the  necessity  to  solve  at  each  iteration  a  large  scale  linear  problem.  In  order 
to  speed  up  the  computational  process  we  have  paid  much  attention  to 
preprocessing of linear programming subproblems. In particular, we
implemented this algorithm by calling each time a processing routine and a
linear programming solver of the Optimization Subroutine Library (OSL)
[OSL92]. This makes it possible to reduce the problem dimensions greatly and to take
advantage of the similarity existing among problems arising at different
iterations.

Results  of  numerical  experiments  are  reported  in  the  full  paper. 
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This  paper  is  concerned  with  the  multistage  stochastic  linear  programming  problem, 
written  in  deterministic  equivalent  form  as 

$\min\ c_0 x_0 + \sum_{k_1=1}^{K_1} p_{k_1} c_{k_1} x_{k_1} + \sum_{k_2=K_1+1}^{K_2} p_{k_2} c_{k_2} x_{k_2} + \cdots + \sum_{k_T=K_{T-1}+1}^{K_T} p_{k_T} c_{k_T} x_{k_T}$

s.t. $A_0 x_0 = b_0$

$B_{k_1} x_0 + A_{k_1} x_{k_1} = b_{k_1}, \quad k_1 = 1, \ldots, K_1$

$B_{k_2} x_{a(k_2)} + A_{k_2} x_{k_2} = b_{k_2}, \quad k_2 = K_1 + 1, \ldots, K_2$ (1)

$\vdots$

$B_{k_T} x_{a(k_T)} + A_{k_T} x_{k_T} = b_{k_T}, \quad k_T = K_{T-1} + 1, \ldots, K_T$

$x_k \ge 0, \quad k = 0, \ldots, K_T.$

Nested Benders decomposition (Birge [1], Gassmann [2]) splits this problem into $K_T + 1$
pieces, one for each node in the decision/event tree. Each of these subproblems takes the
form

$\min\ c_k x_k + \theta_k$

s.t. $A_k x_k = b_k - B_k x_{a(k)}$

$F_k x_k \ge f_k$ (2)

$G_k x_k + \mathbf{1}\theta_k \ge g_k$

$x_k \ge 0.$

Here $(F_k, f_k)$ defines feasibility cuts, generated by the subproblems beyond $k$'s time stage
to ensure their feasibility, and $(G_k, g_k)$ are optimality cuts ($\mathbf{1}$ is a column of ones) which
cut off non-optimal parts of $k$'s feasible region.
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As  the  number  of  stages  increases,  information  has  to  travel  through  more  and  more 
intermediate  points  to  get  from  the  first  stage  to  the  horizon  and  back,  and  one  would 
expect  this  process  to  be  quite  time-consuming. 

The fast-start method described in this paper selects one scenario, that is, a single
path from the root of the event tree out to one of the leaf nodes, and solves it as a regular
LP of reasonable size. If the scenario represents an “average” set of realizations, then one
would expect the optimal decision variables for the stochastic problem to be “near” the
optimum values for the scenario problem.

In other words, the scenario solution can be used as a reasonable starting point for the
other problems. The important distinction between this approach and scenario aggregation
and the progressive hedging algorithm of Rockafellar and Wets [4] is that in the present
approach a scenario is not an indivisible unit but is merely seen as a means to an end,
namely to find good starting bases for the node problems (2).

The  main  difficulty  lies  in  disaggregating  the  scenario  solution  into  the  different  stages. 

Let’s look at the two-stage problem first. We can assume without loss of generality
that the first scenario consists of nodes 0 and 1, and we can separate the optimal solution
$x^*$ into five different components (an optimal solution must exist if the overall problem (1)
is to have a solution):

- $x^*_{0B}$ are first stage columns which are basic in first stage rows

- $x^*_{0S}$ are first stage columns which are basic in second stage rows

- $x^*_{0N}$ are first stage columns which are nonbasic (and have value 0)

- $x^*_{1B}$ are basic second stage columns

- $x^*_{1N}$ are nonbasic second stage columns

It is clear from the problem structure that all the components of $x^*_{1B}$ must be basic
in second stage rows. Moreover, if $s = |S|$ is the number of components in the second
group, then we need $s + 1$ cuts to force the solution of the node problem (2) to agree with
the first stage decisions of the scenario problem. These cuts are derived from the second
stage problem


$\min\ c_1 x_1$

s.t. $A_1 x_1 = b_1 - B_1 x_0$ (3)

$x_1 \ge 0,$

using $x_0 = x_0^*$ and the $2s$ perturbations $x_0^* \pm \delta e_i$ for $i \in S$ and some step length $\delta$. At
most one of the two directions will yield a cut to the node 0 problem, and the cuts can
be both optimality and feasibility cuts. The starting basis for solving (3) is in all cases
defined by $x^*_{1B}$, augmented with slacks corresponding to the deleted variables $x^*_{0S}$.

However,  the  cuts  in  node  0  are  only  valid  for  the  single  scenario  problem,  so  optimality 
cuts  have  to  be  updated  (“peeled  back”  in  the  language  of  Higle  and  Sen  [3])  to  reflect 
contributions  from  the  other  problems. 

There  are  several  ways  to  perform  the  update. 

A. If the uncertainty is in the right hand side only and if the solved scenario corresponds
to the mean value of the realizations (i.e. $b_1 = \sum_i p_i b_i$), then the cuts are valid
without modification. (This is a simple application of Jensen’s inequality.)

B. If $L_i$ is a lower bound on the objective value for the node problem at node $i$, for
$i = 1, \ldots, K_1$, then the optimality cut

$\pi_1 B_1 x_0 + \theta_0 \ge \pi_1 b_1$ (4)

must be adjusted to read

$p_1 \pi_1 B_1 x_0 + \theta_0 \ge p_1 \pi_1 b_1 + \sum_{i=2}^{K_1} p_i L_i.$

C. If $\pi_i$ are dual feasible solutions for all the second stage node problems, then the
optimality cut (4) can be adjusted to

$\sum_{i=1}^{K_1} p_i \pi_i B_i x_0 + \theta_0 \ge \sum_{i=1}^{K_1} p_i \pi_i b_i.$

This form is obviously tighter than B. but involves more work. If the A-matrices and
cost coefficients are deterministic, it is of course permissible to use $\pi_1$ throughout.
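
The arithmetic behind update options B and C can be sketched as follows. The sketch only assembles the coefficient vector of $x_0$ and the right-hand side of the adjusted cut; the coefficient of $\theta_0$ is 1 and the direction of the inequality is not encoded. The data layout, the function names and the numbers are hypothetical.

```python
def row_times_matrix(row, M):
    """Return the row vector row * M for a matrix M stored as a list of rows."""
    return [sum(r * M[k][j] for k, r in enumerate(row)) for j in range(len(M[0]))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cut_option_b(p1, pi1, B1, b1, others):
    """Option B: `others` lists (p_i, L_i) for the nodes i = 2, ..., K_1."""
    coeff = [p1 * v for v in row_times_matrix(pi1, B1)]
    rhs = p1 * dot(pi1, b1) + sum(p * L for p, L in others)
    return coeff, rhs


def cut_option_c(nodes):
    """Option C: `nodes` lists (p_i, pi_i, B_i, b_i) for the nodes i = 1, ..., K_1."""
    coeff = [0.0] * len(nodes[0][2][0])
    rhs = 0.0
    for p, pi, B, b in nodes:
        row = row_times_matrix(pi, B)
        coeff = [c + p * r for c, r in zip(coeff, row)]
        rhs += p * dot(pi, b)
    return coeff, rhs


# Hypothetical two-node second stage with equal probabilities.
node1 = (0.5, [2.0], [[1.0, 0.0]], [4.0])
node2 = (0.5, [1.0], [[1.0, 1.0]], [6.0])
cut_b = cut_option_b(node1[0], node1[1], node1[2], node1[3], others=[(0.5, 1.0)])
cut_c = cut_option_c([node1, node2])
print(cut_b, cut_c)
```

Option B needs only a bound $L_i$ for each unsolved node, while option C needs a dual feasible vector for every node, which is the extra work mentioned above.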

D. A simplified version of the algorithm dispenses with creating the cuts entirely and
simply throws away information about the “superbasic” variables.

Multistage  problems  are  similar;  the  schematic  decision  tree  of  figure  la.  may  serve 
as  an  example  with  four  stages. 


Figure  la.  A  four-stage 
decision  tree 


Figure  lb.  A  two-stage 
lower  bound  problem 


The scenario problem (1 2 3 4) is solved first and cuts based on the second stage (2 3 4)
are created and updated to be valid for the eight-scenario problem indicated in figure 1b.
Because the value of information is always nonnegative, this defines a lower bound, and
hence the cuts will also be valid for the configuration of figure 1a.

The  optimal  basis  can  be  copied  to  the  scenario  problem  (3  6  12),  then  both  are 
separated  out,  the  cuts  are  created  and  updated  so  as  to  be  valid  for  each  of  the  four 
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scenario  problems  lying  downstream  from  nodes  2  and  3,  then  one  works  on  the  scenarios 
(4  8),  (5  10),  (6  12)  and  (7  14)  in  the  same  fashion,  and  finally  one  arrives  at  the  fully 
separated  problem  with  15  nodes  used  in  the  nested  Benders  decomposition. 

It is possible for one of the subsequent scenario problems (3 6 12), (5 10), (6 12),
(7 14) to be infeasible. In that case, no optimality cuts should be derived from this
scenario (or any of the subsequent scenarios embedded in it), which may necessitate
some modification to the updating of cuts if option C. is used.

Preliminary  results  on  a  set  of  standard  test  problems  show  a  consistent  reduction  in 
CPU  time  of  between  10  and  30%  over  standard  nested  decomposition  methods. 
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Some  connections  between  integer  programming 
and  continuous  optimization 
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Abstract. 

The purpose of this paper is to stress the importance of analyzing
the relationships between combinatorial and continuous optimization. In
order to attract more attention to this field, some recent and less recent
investigations will be touched upon.

First of all we discuss an equivalence property between two extremum
problems of type:

(1) $\min f(x)$, $x \in Z \cap R$,
and

(2) $\min [f(x) + \mu \varphi(x)]$, $x \in X \cap R$,

where $Z \subset X$ represents a relaxation.

This property is used to establish an equivalence between a nonlinear
integer programming problem of type:

(3) $\min f(x)$, $g(x) \ge 0$, $x \in \{0, 1\}^n$

and a continuous nonlinear programming problem of type:

(4) $\min [f(x) + \mu x^T (e - x)]$, $g(x) \ge 0$, $0 \le x \le e$,

where $e^T := (1, \ldots, 1)$. A quadratic objective function is considered as a
special case.
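
The role of the penalty term $\mu x^T(e - x)$ in (4) can be illustrated numerically. The sketch below uses a hypothetical convex quadratic objective without the constraint $g(x) \ge 0$ and compares the best binary point with the minimizer of the penalized objective over a uniform grid of the box; the grid search is only an illustration and not a solution method. For this particular $f$ the penalized objective is concave in each coordinate once $\mu > 1$, so its minimum over the box is attained at a binary point.

```python
from itertools import product


def f(x):
    """Hypothetical quadratic objective used only for this illustration."""
    return (x[0] - 0.3) ** 2 + (x[1] - 0.8) ** 2


def penalized(x, mu):
    """Objective of (4): f(x) + mu * x^T (e - x)."""
    return f(x) + mu * sum(xi * (1.0 - xi) for xi in x)


def argmin_on_grid(mu, steps=20):
    """Minimize the penalized objective over a uniform grid of 0 <= x <= e."""
    grid = [i / steps for i in range(steps + 1)]
    return min(product(grid, repeat=2), key=lambda x: penalized(x, mu))


binary_best = min(product([0.0, 1.0], repeat=2), key=f)
print(binary_best, argmin_on_grid(mu=2.0), argmin_on_grid(mu=0.0))
```

Without the penalty the continuous minimizer is the fractional point (0.3, 0.8); with $\mu = 2$ the grid minimizer coincides with the best binary point.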

By means of the preceding relationships it is easy to connect an
integer programming problem with a complementarity problem in the
form

(5) $F(x) \ge 0$, $x \ge 0$, $\langle F(x), x \rangle = 0$,
with a variational inequality in the form

(6) $\langle G(x), y - x \rangle \ge 0$, $\forall y \in K$,
and with a fixed-point problem in the form

$x = \Phi(x)$,

where the maps $F$, $G$, $\Phi$ and the set $K$ are defined by the data of (3).

The results which connect (1) and (2) can be generalized and
exploited to close the duality gap when the Lagrangean relaxation is applied
to a facial constraint.

Some  remarks  are  made  about  the  resolution  of  problem  (4)  and  of 
variational  inequality  (5). 

In order to deepen the relationships between combinatorial and continuous
optimization we introduce the concept of image of a constrained
extremum problem. If this is given in the form

(7)  $\min \varphi(x), \quad g(x) \ge 0, \quad x \in X, \qquad (\varphi : X \to R;\ g : X \to R^m),$
then, given any $\bar{x} \in X$, the image of (7) is the set

$\mathcal{K} := \{(u, v) \in R \times R^m : u = \varphi(\bar{x}) - \varphi(x),\ v = g(x),\ x \in X\}.$

Now the optimality of $\bar{x}$ is reduced to the disjointness of $\mathcal{K}$ and the set

$\mathcal{H} := \{(u, v) \in R \times R^m : u > 0,\ v \ge 0\}.$

Separation arguments or alternative ones can be used to show $\mathcal{H} \cap \mathcal{K} = \emptyset$.
The  fact  that  separation  and  alternative  may  be  considered  equivalent  but 
different  languages  for  expressing  the  same  thing  enables  one  to  reduce, 
to  the  same  scheme,  very  different  problems,  for  instance,  problem  (6). 
If $K$ is given in the following form:

$K := \{y \in X : g(y) \ge 0\},$

where $g : X \to R^m$, then obviously $x \in K$ solves (6) iff the system (in
the variable $y$)

$\langle G(x), x - y \rangle > 0, \quad g(y) \ge 0, \quad y \in X$
is impossible. Hence it is easy to introduce the image of (6) as the set
$\mathcal{K}(x) := \{(u, v) \in R \times R^m : u = \langle G(x), x - y \rangle,\ v = g(y),\ y \in X\}.$
This leads us to associate, to problem (6), the following “gap function”
$\psi(x) := \min_{\omega \in \Omega} \max_{y \in X} [\langle G(x), x - y \rangle + \gamma(g(y); \omega)],$

where $\gamma : R^m \times \Omega \to R$ belongs to a wide class of functions (called separation
functions), which includes the linear ones, and represents the generalized
Lagrange multiplier approach. The crucial properties of $\psi$ are:
$\psi(x) \ge 0$ $\forall x$ and $\psi(x) = 0$ iff $x$ solves (6). Hence we have a “two-way
connection” between optimization problems and variational inequalities. An
interesting application of this is to the study of equilibrium in a network.
In some real cases (6) is a better model than (7), since the equilibrium
is better interpreted by the well-known Wardrop principle. In this case the
above equivalence may enable us to exploit the methods of combinatorial
optimization for solving a “continuous model” like (6).

New Results for Aggregating Integer-Valued
Equations

FRED GLOVER - School of Business, University of Colorado,
Boulder, CO 80309-0419, USA

DJANGIR BABAYEV - US West Advanced Technologies, 4001 Discovery Drive,
Boulder, Colorado 80303, USA

§1.  Introduction 

Our goal in this paper is to provide new theorems for aggregating general
integer-valued equations that can be shown to imply useful analytical results. We give
theorems that provide new methods for aggregating equations, and show that each yields
significant earlier results as special cases.

A number of references have been devoted to identifying rules for aggregating
equations, which determine integer-valued weights for the equations so their linear
combination yields a single equation with the same nonnegative integer solution set as the
original collection. To emphasize this rigorous equivalence between the original system
and the corresponding single equation we refer to this outcome as "integer equivalent
aggregation" or IEA. The coefficients of the aggregated equation tend to become
exceedingly large as the number of original equations increases, and hence it is desirable to
identify weights so these coefficients will lie in a range as limited as possible. Babayev and
Mamedov [2] have derived a novel result for integer-valued equations whose right hand
sides are equal to 1, yielding what have been conjectured to be the smallest possible
weights. Later this result was extended by Knyazev [5] to the case where right hand sides
are equal to a common integer value $b \ge 1$. Our new results subsume these earlier results
and also give a variety of additional ways to aggregate equations.

§2. Notation and General Results

Let $N = \{1, 2, \ldots, n\}$, and $X$ be a subset of the nonnegative $n$-dimensional integer
vectors (as possibly constrained by additional inequalities or equalities of interest), and
consider the equations

(1)  $\sum_{j \in N} a_{1j} x_j = a_{10}, \quad x \in X,$

(2)  $\sum_{j \in N} a_{2j} x_j = a_{20}, \quad x \in X,$

where the summations are over components of $x$, and the coefficients $a_{1j}$ and $a_{2j}$ are
integers, not necessarily nonnegative. Define $a_{0j} = w_1 a_{1j} + w_2 a_{2j}$ for $j \in N \cup \{0\}$, where
$w_1$ and $w_2$ are integer weights of arbitrary sign. Then we seek conditions under which the equation

(3)  $\sum_{j \in N} a_{0j} x_j = a_{00}, \quad x \in X,$

has the same solution set as (1) and (2). A collection of equations can thereby be
aggregated iteratively, taking (3) in the role of (1) and letting each successive equation
of the collection (except the first) take the role of (2).

Acknowledgment. This research is supported in part by the Joint Air Force Office of Scientific Research
and Office of Naval Research Contract No. 49620-90-C-0033, at the University of Colorado.

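Let $X^1$ and $X^2$ be two supersets of $X$ (i.e., $X^1 \supseteq X$ and $X^2 \supseteq X$) and
consider the two related equations:

(1a)  $\sum_{j \in N} a_{1j} x_j = a_{10}, \quad x \in X^1,$

(2a)  $\sum_{j \in N} a_{2j} x_j = a_{20}, \quad x \in X^2.$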

Typically (1a) and (2a) each imply a number of linear inequalities in nonnegative
coefficients, as, for example, simple upper bounds on some components of $x$. In general,
we will represent any of these inequalities implied respectively by (1a) and (2a), as

(1*)  $\sum_{j \in N} b_{1j} x_j \le b_{10}, \quad x \in X^1,$

(2*)  $\sum_{j \in N} b_{2j} x_j \le b_{20}, \quad x \in X^2,$

where coefficients $b_{1j}$ and $b_{2j}$ are assumed nonnegative, though not necessarily integer.


Our first major result is

Theorem 1. For $k = 1, 2$ let $X_0^k = \{x \in X^k : x_j = 0$ for each $j \in N$ such
that $x_j$ is bounded from above by $x \in X^k\}$ and let $w_1$ and $w_2$ be relatively prime
integers. Then (3) is equivalent to (1) and (2) if

(4)  $w_1 = \sum_{j \in N} a_{2j} x'_j$ for some $x' \in X_0^2$ that violates (2*), i.e., $\sum_{j \in N} b_{2j} x'_j > b_{20}$,

(5)  $w_2 = \sum_{j \in N} a_{1j} x''_j$ for some $x'' \in X_0^1$ that violates (1*), i.e., $\sum_{j \in N} b_{1j} x''_j > b_{10}$. ♦

Our  second  result  makes  use  of  upper  and  lower  bounds  on  the  left  hand  sides  of 
the  equations  (1)  and  (2). 

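We introduce a collection of eight inequality conditions by the following notation:

(6)
$C^+_{1u}:\ w_2 > U_1 - a_{10};$        $C^-_{1u}:\ -w_2 > U_1 - a_{10};$
$C^+_{1l}:\ -w_2 > -(L_1 - a_{10});$    $C^-_{1l}:\ w_2 > -(L_1 - a_{10});$
$C^+_{2u}:\ -w_1 > U_2 - a_{20};$       $C^-_{2u}:\ w_1 > U_2 - a_{20};$
$C^+_{2l}:\ w_1 > -(L_2 - a_{20});$     $C^-_{2l}:\ -w_1 > -(L_2 - a_{20});$

where

(7)  $U_i = \max(\max_{x \in X} \sum_{j \in N} a_{ij} x_j,\ a_{i0}), \quad L_i = \min(\min_{x \in X} \sum_{j \in N} a_{ij} x_j,\ a_{i0}), \quad i = 1, 2.$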

Each of the inequalities of (6) is strict, with nonnegative right hand sides, so they
provide lower bounds for the absolute values of the multipliers $w_1$ and $w_2$. The symbols
$u$ and $l$ in (6) refer to inequalities based on upper and lower bounds, respectively.
Allowing $i$ to take the values 1 and 2, (7) determines $U_i$ and $L_i$ by reference to
equations (1) and (2). If we employ the conventions

$+ \leftrightarrow 1, \quad - \leftrightarrow 2, \quad u \leftrightarrow 1, \quad l \leftrightarrow 2,$

then the conditions of (6) can be succinctly represented in the composite form

(8)  $C^p_{is}: \quad i, p, s \in \{1, 2\}.$

Theorem 2. Equation (3) is equivalent to the system (1) - (2) if $w_1$ and
$w_2$ are relatively prime integers that satisfy any pair of conditions $(C^1_{is}, C^2_{jr})$, i.e.,
$(C^+_{is}, C^-_{jr})$, when the following relation is valid

(9)  $i \ne j$ or $s \ne r$.

If $i = j$ in the pair $(C^1_{is}, C^2_{jr})$, then $w_i$ can be given an arbitrary integer value,
relatively prime with $w_{3-i}$. ♦
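The bound-based choice of multipliers can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' procedure: it takes $w_1 = 1$, assumes a finite set X so that $U_1$ and $L_1$ can be found by enumeration, and picks the smallest positive $w_2$ with $w_2 > U_1 - a_{10}$ and $w_2 > a_{10} - L_1$. The two example equations and all function names are hypothetical.

```python
from itertools import product
from math import gcd

def lhs(coeffs, x):
    return sum(a * xj for a, xj in zip(coeffs, x))

def bound_based_weights(a1, a10, X, w1=1):
    # U_1 and L_1: extreme values of the left-hand side of (1) over X,
    # widened if necessary so that they bracket the right-hand side a10.
    values = [lhs(a1, x) for x in X]
    U1 = max(max(values), a10)
    L1 = min(min(values), a10)
    # Smallest positive w2 with w2 > U1 - a10 and w2 > a10 - L1, coprime with w1.
    w2 = max(U1 - a10, a10 - L1) + 1
    while gcd(w1, w2) != 1:
        w2 += 1
    return w1, w2

def solution_set(equations, X):
    return {x for x in X if all(lhs(a, x) == b for a, b in equations)}

X = list(product(range(4), repeat=2))   # assumed finite X = {0,...,3}^2
a1, a10 = [1, 1], 2                     # x1 + x2 = 2
a2, a20 = [1, -1], 0                    # x1 - x2 = 0
w1, w2 = bound_based_weights(a1, a10, X)
a0 = [w1 * p + w2 * q for p, q in zip(a1, a2)]
a00 = w1 * a10 + w2 * a20
```

Here the aggregated equation is $6x_1 - 4x_2 = 2$, and `solution_set([(a0, a00)], X)` can be compared directly with the solution set of the original pair.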

Note 1. For aggregating it is not necessary that the system (1) - (2) be consistent. The
theorems of this paper are also valid for systems that are inconsistent, i.e., that have an
empty set of solutions.

In computational practice it may be more important to obtain a smaller maximum
for the absolute values of the coefficients of the aggregated equation and its right hand
side than to restrict the size of multipliers $w_1$ and $w_2$. In some cases this can be achieved
by increasing the absolute value of one of the multipliers.

Note  2.  The  proof  of  Theorem  2  discloses  that  the  validity  of  this  theorem  requires  the 
left  hand  sides  of  equations  (1)  and  (2)  to  take  integer  values,  which  is  crucial  for  the 
proof.  On  the  other  hand,  the  proof  does  not  rely  on  the  signs  of  the  variables  or  their 
integrality,  or  on  the  linearity  of  the  left  hand  sides  of  the  aggregated  equations  (1)  and 
(2).  That  means  that  there  is  an  analog  of  Theorem  2  that  is  valid  for  establishing  integer 
equivalent  aggregations  of  systems  of  a  more  general  nature. 

Let $Y$ be a set of an arbitrary structure and let $S_i(y)$, $i = 1, 2$, be real
integer-valued functions defined for arguments $y \in Y$. Consider a system

(1a)  $S_1(y) = a_{10}, \quad y \in Y,$

(2a)  $S_2(y) = a_{20}, \quad y \in Y,$

and define

(7a)  $U_i = \max(\max_{y \in Y} S_i(y),\ a_{i0}), \quad L_i = \min(\min_{y \in Y} S_i(y),\ a_{i0}), \quad i = 1, 2.$

Then we may state

Theorem 2a. The equation
(3a)  $w_1 S_1(y) + w_2 S_2(y) = a_{00}$

is equivalent to the system (1a) - (2a) if $w_1$ and $w_2$ are relatively prime integers that
satisfy any pair of conditions $(C^1_{is}, C^2_{jr})$, i.e., $(C^+_{is}, C^-_{jr})$, and relation (9) is valid.

If $i = j$ in the pair $(C^1_{is}, C^2_{jr})$, then $w_i$ can be given an arbitrary integer value,
relatively prime with $w_{3-i}$. ♦

Note 3. In the proof of Theorem 2 the values $U_i$ and $L_i$ were used for determining
lower bounds for the absolute values of the multipliers $w_1$ and $w_2$. These bounds in general
can be relaxed by obtaining smaller values for $U_i$ and larger values for $L_i$, $i = 1, 2$.

Define 

(10)  i/;  =max(max5i,  a.o)  s-L  (-1)“.^  z  =  l,2. 

Jl€X 

» 

where  a  =  l-z+s+[(l+jgn(Wi)]/2,  and  the  symbols  i  and  s  refer  to  indexes  of 
the  condition  ^  (according  to  (8)  £/,  is  contained  in  condition  ).  The  same 
constraints  apply  also  for  determining  Z,.  i  =  l,2, 

(11)  Z,  =min(min5.,  s.t.  (-l)“Si^  <(-I)“[flj_io -(-0“^5rt(W;)w,].  z  =  l,2. 

Then the following theorem is valid

Theorem 2b. The equation

(3a)  $w_1 S_1(y) + w_2 S_2(y) = a_{00}$

is equivalent to the system (1a) - (2a) if $w_1$ and $w_2$ are relatively prime integers that
satisfy any pair of conditions $(C^1_{is}, C^2_{jr})$, i.e., $(C^+_{is}, C^-_{jr})$, when $U_i$ and $L_i$ are
determined by (10) and (11), respectively, and relation (9) is valid.

If $i = j$ in the pair $(C^1_{is}, C^2_{jr})$, then $w_i$ can be given an arbitrary integer value,
relatively prime with $w_{3-i}$. ♦

Comment. Theorems 2, 2a and 2b integrate some ideas expressed in [3], [4] and [6].

In Theorem 1 the stipulation on $w_1$ was used to rule out $t > 0$, where $t$ is the integer
with $\sum_{j \in N} a_{1j} x_j - a_{10} = w_2 t$ and $\sum_{j \in N} a_{2j} x_j - a_{20} = -w_1 t$ for a solution $x$ of (3).
The case $t < 0$ was eliminated by the condition on $w_2$. In Theorem 2 conditions $C^+$ and $C^-$,
respectively, serve the same purposes. From this observation it follows that the aggregation
will be valid also under the proper combination of the conditions of Theorem 1 and
Theorem 2. This conclusion constitutes the content of the following result.

Theorem 3. If $w_1$ and $w_2$ are relatively prime integers, then equation
(3) or (3a) is equivalent to system (1) - (2) or (1a) - (2a), respectively, if any one of
conditions $C^+$ replaces the stipulation on $w_1$ in Theorem 1, or if any one of
conditions $C^-$ replaces the stipulation on $w_2$ in Theorem 1. ♦

§3. Application to a Special Class of Equations.

We will now show how Theorems 1 and 3 can each be applied to generate the
special results of [3] for aggregating the system of equations

(12)  $x_j = b, \quad j \in N,$

where $b$ is a positive integer. (In the case $b = 1$ we have the result of [2].)

The interest in aggregating this system is that the $x_j$ variables can be replaced by any
functions $f_j(x)$ that have nonnegative integer values for $x \in X$, thereby making it
possible to replace the equations

$f_j(x) = b, \quad j \in N,$

by a single equivalent equation. We state the corresponding result of [5] in an equivalent
but slightly different form.

Theorem 4 [5]. Define

(13)  $d_j^{(n)} = [(b+1)^n - (b+1)^{n-j}] / b, \quad j \in N,$

      $d_0^{(n)} = [(nb - 1)(b+1)^n + 1] / b.$

Then, for $n \ge 1$ the equation

(14)  $\sum_{j \in N} d_j^{(n)} x_j = d_0^{(n)}$

has the unique solution, given by (12), when the $x_j$ variables are constrained to
nonnegative integers. ♦
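For small $n$ and $b$ the uniqueness claim can be checked by exhaustive enumeration. The sketch below assumes the coefficients $d_j^{(n)} = [(b+1)^n - (b+1)^{n-j}]/b$ and the right-hand side $d_0^{(n)} = [(nb-1)(b+1)^n + 1]/b$; the function names are illustrative and the enumeration is only practical for very small instances.

```python
from itertools import product

def d(j, n, b):
    # Coefficient d_j^(n); j = 0 gives the right-hand side d_0^(n).
    if j == 0:
        return ((n * b - 1) * (b + 1) ** n + 1) // b
    return ((b + 1) ** n - (b + 1) ** (n - j)) // b

def nonnegative_solutions(n, b):
    # Enumerate all nonnegative integer solutions of sum_j d_j x_j = d_0.
    coeffs = [d(j, n, b) for j in range(1, n + 1)]
    rhs = d(0, n, b)
    ranges = [range(rhs // c + 1) for c in coeffs]
    return [x for x in product(*ranges)
            if sum(c * xj for c, xj in zip(coeffs, x)) == rhs]
```

For example, `nonnegative_solutions(2, 2)` enumerates the solutions of $3x_1 + 4x_2 = 14$, whose only nonnegative integer solution is $x_1 = x_2 = 2$.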

To show that Theorem 4 is actually a consequence of Theorem 1, we take
$X = \{x \ge 0$ and integer$\}$, $X^1 = X^2 = X$ and make use of the following identities:

(i)  $d_n^{(n)} = [(b+1)^n - 1] / b,$

(ii)  $d_j^{(q)} = (b+1)\, d_j^{(q-1)}$, for $1 \le j < q$.

It is easy to show that Theorem 1 yields (14) for $n = 2$. As (1) and (2) consider
the following equations

(15)  $x_1 = b,$

(16)  $x_2 = b.$

The definitions of $X$ and $X^k$ imply $X_0^k = X^k$, $k = 1, 2$. Equations (15) and (16) imply
the following inequalities (15') and (16'), which we take as (1*) and (2*):

(15')  $x_1 \le b,$

(16')  $x_2 \le b.$

The smallest value of $w_1$ satisfying (4) is obtained by taking $x'_2 = b + 1$, which implies
$w_1 = b + 1$. Then the smallest value of $w_2$ consistent with condition (5) and relatively
prime with $w_1$ is obtained by assuming $x''_1 = b + 2$, which implies $w_2 = b + 2$.

These values of $w_1$ and $w_2$ lead to the aggregating equation (as (3))

$(b + 1) x_1 + (b + 2) x_2 = [(2b - 1)(b + 1)^2 + 1] / b,$
which is (14) for $n = 2$.

In general, using mathematical induction, we take (1) to be (14) for $n = q - 1$, (2)
to be $x_q = b$, $X = \{x \ge 0$ and integer$\}$, $X^1 = X^2 = X$ and show that Theorem 1
implies: (3) is (14) for $n = q$.

By Theorem 1 $w_1 = x'_q$ and the smallest value of $w_1$ consistent with condition (4)
is obtained by taking $x'_q = b + 1$ (since $x_q \le b$), which gives $w_1 = b + 1$. According to the
relation (5)

$w_2 = d_1^{(q-1)} x''_1 + d_2^{(q-1)} x''_2 + \ldots + d_{q-1}^{(q-1)} x''_{q-1}.$

Coefficients $d_j^{(q-1)}$ for $j = 1, 2, \ldots, q - 2$ are divisible by $(b + 1)$. To obtain $w_2$ relatively
prime with $w_1 = b + 1$ it is necessary to take $x''_{q-1} \ne 0$ and it suffices to choose a value of
$x''_{q-1}$ that yields

$d_{q-1}^{(q-1)} x''_{q-1} \equiv 1 \pmod{b + 1}$
or

$d_{q-1}^{(q-1)} x''_{q-1} \equiv b \pmod{b + 1}.$

As can be seen from (i), $d_{q-1}^{(q-1)} \equiv 1 \pmod{b + 1}$ for an arbitrary $b$, so the first congruence
holds for $x''_{q-1} = 1$. So we assume $x''_{q-1} = 1$. According to (5) $x''$ should violate
(1*), which may be taken in the form $x_j \le b$ for any $j = 1, 2, \ldots, q - 2$. The smallest of the
coefficients $d_j^{(q-1)}$ is $d_1^{(q-1)}$. To obtain a smaller value for $w_2$ we take $x''_1 = b + 1$
and $x''_2 = x''_3 = \ldots = x''_{q-2} = 0$. Thus $w_2 = d_1^{(q-1)}(b + 1) + d_{q-1}^{(q-1)} = d_q^{(q)}$. The last equality is
obtained with the use of identity (i) and the definition of $d_j^{(n)}$, (13). Further (3) and the
identity (ii) immediately yield (14) for $n = q$.

Now we shall derive Theorem 4 from Theorem 3. Renumber the variables
$x_j$ in (14) in the reverse order by introducing index $p = n - j + 1$ and define

(17)  $c_p^{(n)} = d_{n-p+1}^{(n)}, \quad p = 1, 2, \ldots, n.$

Then (14) may be written as

(18)  $\sum_{p=1}^{n} c_p^{(n)} x_p = d_0^{(n)}.$

Noting $n - j = p - 1$, (13) implies

(19)  $c_p^{(n)} = [(b+1)^n - (b+1)^{p-1}] / b = (b+1)^{p-1} [(b+1)^{n-p+1} - 1] / b.$

Then the relation

$a^m - 1 = (a - 1)(a^{m-1} + a^{m-2} + \ldots + 1)$
for $a = b + 1$ and $m = n - p + 1$ gives

$[(b+1)^{n-p+1} - 1] / b = (b+1)^{n-p} + (b+1)^{n-p-1} + \ldots + (b+1) + 1$
and from (19) it follows

(20)  $c_p^{(n)} = (b+1)^{n-1} + (b+1)^{n-2} + \ldots + (b+1)^{p-1} = \sum_{k=p-1}^{n-1} (b+1)^k.$

The last relation implies the following identities

(iii)  $c_n^{(n)} = (b+1)^{n-1},$

(iv)  $c_p^{(n)} = (b+1)^{n-1} + c_p^{(n-1)}.$
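The closed form, the geometric-sum form and the two identities can be cross-checked numerically. The sketch below assumes $c_p^{(n)} = [(b+1)^n - (b+1)^{p-1}]/b$ as the closed form; the function names are illustrative only.

```python
def c_closed(p, n, b):
    # Closed form: c_p^(n) = [(b+1)^n - (b+1)^(p-1)] / b.
    return ((b + 1) ** n - (b + 1) ** (p - 1)) // b

def c_sum(p, n, b):
    # Geometric-sum form: sum of (b+1)^k for k = p-1, ..., n-1.
    return sum((b + 1) ** k for k in range(p - 1, n))

def identities_hold(n, b):
    # Closed form agrees with the sum form for every p.
    ok = all(c_closed(p, n, b) == c_sum(p, n, b) for p in range(1, n + 1))
    # Identity (iii): c_n^(n) = (b+1)^(n-1).
    ok = ok and c_closed(n, n, b) == (b + 1) ** (n - 1)
    # Identity (iv), for p < n: c_p^(n) = (b+1)^(n-1) + c_p^(n-1).
    ok = ok and all(
        c_closed(p, n, b) == (b + 1) ** (n - 1) + c_closed(p, n - 1, b)
        for p in range(1, n)
    )
    return ok
```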

We shall follow again the reasoning of mathematical induction. As can be seen by
substitution for $n = 1$, (18) has a single solution $x_1 = b$. To obtain (18) for an arbitrary
$1 < q \le n$ consider the following system of two equations

(21)  $\sum_{p=1}^{q-1} c_p^{(q-1)} x_p = d_0^{(q-1)},$

(22)  $x_1 + x_2 + \ldots + x_q = qb,$

where (21) is (18) for $n = q - 1$ and is taken as (1) in Theorem 3, and (22) is taken as
(2) in Theorem 3. Relation (22) is equivalent to $x_q = b$, where, by mathematical
induction, it is assumed that equation (21) has a single solution:
$x_p = b$ for $p = 1, 2, \ldots, q - 1$. As in the proof of Theorem 3 select $w_1 = 1$. Then

(23)  $U_1 = \max \sum_{p=1}^{q-1} c_p^{(q-1)} x_p$ subject to $\sum_{p=1}^{q} x_p \le qb - 1.$

The largest coefficient among $c_p^{(q-1)}$ is $c_1^{(q-1)}$. From (17) and (13)

$c_1^{(q-1)} = d_{q-1}^{(q-1)} = [(b+1)^{q-1} - 1] / b.$

In addition, the constraint in (23) implies

$\sum_{p=1}^{q-1} x_p \le \sum_{p=1}^{q} x_p \le qb - 1$

and from (23) it follows that $U_1 = c_1^{(q-1)} (qb - 1)$. Further

(24)  $w_2 > U_1 - a_{10} = c_1^{(q-1)} (qb - 1) - d_0^{(q-1)} = (b+1)^{q-1} - q.$

Finally we require $w_2 = \sum_{p=1}^{q-1} c_p^{(q-1)} x''_p$ and select $x''_p = 0$ for $p = 1, 2, \ldots, q - 2$ and
$x''_{q-1} = b + 1$. This choice of $x''$ provides the smallest possible value of $w_2$, which meets
condition (5) of Theorem 1 (violating $x_{q-1} \le b$ as (1*)) and satisfies (24), because as a
result of the identity (iii) $w_2 = c_{q-1}^{(q-1)} (b + 1) = (b + 1)^{q-1}$. Weights $w_1 = 1$ and
$w_2 = (b + 1)^{q-1}$ for (21) and (22) and the identity (iv) lead straightforwardly to (18)
for $n = q$.

Concluding  Observation. 

It is important to note that the results of [1], [2] and [5] for aggregating the
system of n integer-valued equations were obtained from considerations which
essentially differ from the approach outlined in §2, in which equations are aggregated step
by step, two equations at a time. In [1], [2] and [5], for the first time in the literature,
analytical formulae were given for weights corresponding to each equation of the system
to be aggregated (in [2] and [5] for common equal right hand sides and in [1] for the
general case of arbitrary right hand sides). The results presented in this last section reveal
the existence of a tight linkage between the two different approaches to the problem of
aggregating integer-valued equations.

1. Background.

Recent  developments  have  shown  the  value  of  integrating  metaheuristic  approaches 
with  special  methods  for  generating  new  neighborhood  structures,  called  ejection  chain 
methods,  and  with  processes  derived  from  connection  models  embodied  in  neural  networks. 
We focus on the implications and consequences of such integrated procedures, with particular
attention to the tabu search framework. At the same time, we establish relationships between
this framework and that of other metaheuristics, demonstrating ways to enhance "population
combining  models"  (which  include  genetic  algorithms)  and  "threshold  based  models"  (which 
include  simulated  annealing),  drawing  on  search  paradigms  from  tabu  search  that  offer  ways 
to  extend  these  other  approaches.  We  also  report  developments  by  which  tabu  search  has 
provided  advances  in  the  uses  of  neural  network  models  in  optimization.  Computational 
studies  are  cited  that  confirm  the  practical  merit  of  these  advances. 

2.  Ejection  Chain  Processes. 

Ejection  chain  methods  give  a  useful  way  to  build  compound  neighborhoods,  with  the 
goal  of  generating  more  powerful  moves  for  solving  discrete  optimization  problems.  Ejection 
chains  combine  and  generalize  ideas  from  a  number  of  sources,  including  classical  alternating 
paths  from  graph  theory,  network  related  base  exchange  constructions  in  matroid  optimization, 
and  bounding  form  structures  for  solving  integer  programming  problems.  Each  of  these 
embodies  a  related  theme  whose  incorporation  into  neighborhood  search  offers  new 
approaches to combinatorial optimization applications.

An ejection chain is initiated by selecting a set of elements to undergo a change of
state (e.g., to occupy new positions or receive new values). The result of this change leads to
identifying a collection of other sets, with the property that the elements of at least one must
be "ejected from" their current states. State-change steps and ejection steps typically
alternate, and the options for each depend on the cumulative effect of previous steps (usually,
but not necessarily, being influenced by the step immediately preceding). In some cases, a
cascading sequence of operations may be triggered, representing a domino effect. The ejection
chain terminology provides a unifying thread that links a collection of useful procedures for
exploiting structure, without establishing a narrow membership that excludes other forms of
classification.
A  number  of  methods  deriving  from  this  perspective  recently  have  appeared  in  the 
literature.  A  node  (or  block)  ejection  procedure  has  been  proposed  by  Glover  (1991a)  for 
traveling  salesman  problems,  and  extended  to  provide  new  approaches  for  quadratic 
assignment and vehicle routing problems. Laguna et al. (1991) introduce an ejection chain
approach  in  conjunction  with  a  tabu  search  procedure  for  multilevel  generalized  assignment 
problems,  and  demonstrate  that  ejection  chains  even  of  "small  depth"  produce  highly  effective 
results in this context. Ejection chain strategies are also proposed for clique partitioning by
Dorndorf and Pesch (1992), similarly yielding good outcomes.

Recently,  ejection  chain  strategies  have  been  developed  for  traveling  salesman 
problems  that  are  founded  on  the  notion  of  creating  a  reference  structure  to  guide  the 
generation  of  acceptable  moves  (Glover,  1992a).  Such  a  structure  can  be  controlled  to 
produce  transitions  between  tours  with  desirable  properties,  generating  alternating  paths  (or 
collections  of  such  paths)  of  a  non-standard  yet  advantageous  type.  These  paths  yield  a 
combinatorial  leverage  effect  which  provides  solutions  that  are  best  among  exponential 
numbers  of  alternatives,  by  the  investment  of  a  low  polynomial  degree  of  effort. 

In particular, special algorithms enable solutions dominating $O(n 2^n)$ alternatives to be
obtained with O(n^) effort, and solutions dominating $O((n/2)!)$ alternatives to be obtained with
O(n^) effort. We show tabu search strategies provide a way to extend the application of these
results,  leading  to  new  solution  procedures  not  only  in  the  TSP  setting,  but  also  for  a  much 
broader  collection  of  graph  and  network-related  problems. 

3.  Links  with  Other  Methods. 

Relevant  ways  to  visualize  relationships  between  tabu  search  and  other  procedures  like 
simulated  annealing  and  genetic  algorithms  provide  a  basis  for  understanding  similarities  and 
contrasts  in  their  philosophies,  and  for  creating  potentially  valuable  hybrid  combinations  of 
these  approaches.  We  suggest  how  elements  of  tabu  search  can  add  a  useful  dimension  to 
such  approaches,  drawing  on  observations  from  Glover  and  Laguna  (1992).  We  assume  the 
reader  has  a  modest  familiarity  with  the  general  form  of  these  approaches  as  a  foundation  for 
the following discussion.

Simulated Annealing. Undoubtedly the most prominent contrast between simulated
annealing and tabu search is the focus on exploiting memory in tabu search that is absent
from  simulated  annealing.  The  introduction  of  this  focus  entails  associated  differences  in 
search  mechanisms,  and  in  the  elements  on  which  they  operate.  Several  elements  in  addition 
to  memory  are  fundamental  for  understanding  the  relationship  between  the  methods.  We 
consider  three  such  elements  in  order  of  increasing  importance. 

First,  tabu  search  emphasizes  careful  probing  of  successive  neighborhoods  to  identify 
moves  of  high  quality,  employing  candidate  list  approaches.  This  contrasts  with  the  simulated 
annealing  approach  of  randomly  sampling  among  these  moves  to  apply  an  acceptance 
criterion  that  disregards  the  quality  of  other  moves  available.  (Such  an  acceptance  criterion 
provides the sole basis for sorting the moves selected in the SA method.) The relevance of
this difference in orientation is accentuated for tabu search, since its neighborhoods include
linkages based on history, and therefore yield access to information for selecting moves that
is not available in neighborhoods of the type used in simulated annealing.

Next,  tabu  search  evaluates  the  relative  attractiveness  of  moves  not  only  in  relation  to 
objective  function  change,  but  also  in  relation  to  factors  of  influence.  Both  types  of  measures 
are  significantly  affected  in  tabu  search  by  the  differentiation  among  move  attributes,  as 
embodied  in  tabu  restrictions  and  aspiration  criteria,  and  in  turn  by  relationships  manifested  in 
recency,  frequency,  and  sequential  interdependence  (hence,  again,  involving  recourse  to 
memory).  Other  aspects  of  the  state  of  search  also  affect  these  measures,  which  depend  on 
the  direction  of  the  current  trajectory  and  the  region  visited. 

Finally, TS emphasizes guiding the search by reference to multiple thresholds, reflected
in  the  tenures  for  tabu-active  attributes  and  in  the  conditional  stipulations  of  aspiration 
criteria.  This  may  be  contrasted  to  the  simulated  annealing  reliance  on  guiding  the  search  by 
reference  to  the  single  threshold  implicit  in  the  temperature  parameter.  The  treatment  of 
thresholds  by  the  two  methods  compounds  this  difference  between  them.  Tabu  search  varies 
its  thresholds  non-monotonically,  reflecting  the  conception  that  multidirectional  parameter 
changes  are  essential  to  adapt  to  different  conditions,  and  to  provide  a  basis  for  locating 
alternatives  that  might  otherwise  be  missed.  This  contrasts  with  the  simulated  annealing 
philosophy  of  adhering  to  a  temperature  parameter  that  only  changes  monotonically. 
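The elements discussed above (probing the whole neighborhood for the best move, recency-based tabu tenures, and an aspiration criterion) can be put together in a very small sketch. It is a generic textbook-style illustration under stated assumptions, not the implementation used in the cited studies: solutions are 0/1 vectors, moves are single-bit flips, and the subset-sum style objective `gap`, the weights and the tenure value are hypothetical.

```python
def tabu_search(objective, start, iterations=50, tenure=3):
    # Minimal tabu search over 0/1 vectors with single-bit-flip moves.
    current = list(start)
    best, best_value = list(current), objective(current)
    tabu_until = [0] * len(current)   # recency memory: last iteration at which a flip is tabu
    for it in range(1, iterations + 1):
        candidates = []
        for i in range(len(current)):
            neighbor = current[:]
            neighbor[i] = 1 - neighbor[i]
            value = objective(neighbor)
            is_tabu = tabu_until[i] >= it
            # Aspiration criterion: a tabu move is admissible if it beats the best value so far.
            if not is_tabu or value < best_value:
                candidates.append((value, i))
        if not candidates:
            continue
        value, i = min(candidates)    # best admissible move, even if it worsens the objective
        current[i] = 1 - current[i]
        tabu_until[i] = it + tenure
        if value < best_value:
            best, best_value = current[:], value
    return best, best_value

weights, target = [7, 5, 4], 9

def gap(x):
    # Distance between the selected weight total and the target.
    return abs(target - sum(w * xi for w, xi in zip(weights, x)))

result = tabu_search(gap, [0, 0, 0], iterations=10)
```

On this instance a pure descent from the empty selection stops at the selection {7} with gap 2, since every single flip from there fails to improve it. The tabu restrictions force the search through non-improving moves, and the aspiration criterion then admits the tabu flip that reaches the selection {5, 4} with gap 0.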

Hybrids  are  now  emerging  that  are  taking  preliminary  steps  to  bridge  some  of  these 
differences,  particularly  in  the  realm  of  transcending  the  simulated  annealing  reliance  on  a 
monotonic temperature parameter. A hybrid method that allows temperature to be
strategically  manipulated,  rather  than  progressively  diminished,  has  been  shown  to  yield 
improved  performance  over  standard  SA  approaches,  as  noted  in  the  work  by  Osman  (1992). 

A  hybrid  method  that  expands  the  SA  basis  for  move  evaluations  also  has  been  found  to 
perform  better  than  standard  simulated  annealing  in  the  study  by  Kassou  (1992). 

Consideration  of  these  findings  invites  the  question  of  whether  removing  the  memory 
scaffolding  of  tabu  search  and  retaining  its  other  features  may  yield  a  viable  method  in  its 
own right. A foundation for doing this by a "tabu thresholding method" is described in
(Glover, 1992b), and is reported in a study of graph layout and design problems by Verdejo
and Cunquero (1992) to perform more effectively than the previously best methods for these
problems.

Genetic Algorithms. Genetic algorithms offer a somewhat different set of
comparisons and contrasts with tabu search. GAs are based on selecting subsets (usually
pairs) of solutions from a population, called parents, and combining them to produce new
solutions called children. Rules of combination to yield children are based on the genetic
notion of crossover, which consists of interchanging solution values of particular variables,
together with occasional operations such as random value changes. Children that pass a
survivability test, probabilistically biased to favor those of superior quality, are then available
to be chosen as parents of the next generation. The choice of parents to be matched in each
generation is based on random or biased random sampling from the population (in some
parallel versions executed over separate subpopulations whose best members are periodically
exchanged or shared). Genetic terminology customarily refers to solutions as chromosomes,
variables as genes, and values of variables as alleles.
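The two combination rules mentioned above can be sketched as follows. This is a minimal illustration assuming 0/1 chromosomes stored as Python lists; one-point crossover is just one common variant, and the function names and the `rate` parameter are hypothetical.

```python
import random

def one_point_crossover(parent_a, parent_b, cut):
    # Interchange the values (alleles) of all variables (genes) from position `cut` onward.
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

def mutate(chromosome, rate, rng):
    # Occasional random value change: flip each 0/1 gene with probability `rate`.
    return [1 - g if rng.random() < rate else g for g in chromosome]

children = one_point_crossover([1, 1, 1, 1], [0, 0, 0, 0], cut=1)
```

Neither function looks at the problem being solved, which is the context-neutral character of crossover taken up later in this section.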

By  means  of  coding  conventions,  the  genes  of  genetic  algorithms  may  be  compared  to 
attributes  in  tabu  search,  or  more  precisely  to  attributes  in  the  form  underlying  certain  TS 
measures  of  frequency  based  memory.  Introducing  memory  in  GAs  to  track  the  history  of 
genes  and  their  alleles  over  subpopulations  would  provide  an  immediate  and  natural  way  to 
create  a  hybrid  with  TS. 

Some important differences between genes and attributes are worth noting, however.
Differentiation of attributes into from and to components, each having different memory
functions, does not have a counterpart in genetic algorithms. This results because GAs are
organized to operate without reference to moves (although, strictly speaking, combination by
crossover can be viewed as a special type of move). Another distinction derives from
differences in the use of coding conventions. Although an attribute change, from a state to its
complement, can be encoded in a zero-one variable, such a variable does not necessarily
provide a convenient or useful representation for the transformations provided by moves.
Tabu restrictions and aspiration criteria handle the binary aspects of complementarity without
requiring explicit reference to a zero-one x vector or two-valued functions. Adopting a
similar orientation (relative to the special class of moves embodied in crossover) might yield
benefits for genetic algorithms in dealing with issues of genetic representation, which
currently pose difficult questions (see, e.g., Liepins and Vose (1990)).

A  contrast  to  be  noted  between  genetic  algorithms  and  tabu  search  arises  in  the 
treatment  of  context,  i.e.,  in  the  consideration  given  to  structure  inherent  in  different  problem 
classes. For tabu search, context is fundamental, embodied in the interplay of attribute
definitions and the determination of move neighborhoods, and in the choice of conditions to
define tabu restrictions. Context is also implicit in the identification of amended evaluations
created  in  association  with  longer  term  memory,  and  in  the  regionally  dependent 
neighborhoods  and  evaluations  of  strategic  oscillation. 


At  the  opposite  end  of  the  spectrum,  GA  literature  characteristically  stresses  the 
freedom of its rules from the influence of context. Crossover, in particular, is a context
neutral  operation,  which  assumes  no  reliance  on  conditions  that  solutions  must  obey  in  a 
particular  problem  setting,  just  as  genes  make  no  reference  to  the  environment  as  they  follow 
their  instructions  for  recombination  (except,  perhaps,  in  the  case  of  mutation).  Practical 
application,  however,  generally  renders  this  an  inconvenient  assumption,  making  solutions  of 
interest  difficult  to  find.  Consequently,  a  good  deal  of  effort  in  GA  implementation  is 
devoted  to  developing  "special  crossover"  operations  that  compensate  for  the  difficulties 
created  by  context,  effectively  reintroducing  it  on  a  case  by  case  basis. 

The  chief  method  by  which  modern  genetic  algorithms  and  their  cousins  handle 
structure is by relegating its treatment to some other method. That is, genetic algorithms
combine  solutions  by  their  parent-children  processes  at  one  level,  and  then  a  different  method 
takes  over  to  operate  on  the  resulting  solutions  to  produce  new  solutions.  These  new 
solutions  in  turn  are  submitted  to  be  recombined  by  the  GA  processes.  In  these  versions, 
pioneered by Mühlenbein, Gorges-Schleuter, and Kramer (1988) and also advanced by Davis
(1991) and Ulder, et al. (1991), genetic algorithms already take the form of hybrid methods.
Hence  there  is  a  natural  basis  for  marrying  GA  and  TS  procedures  in  such  approaches.  But 
genetic  algorithms  and  tabu  search  also  can  be  joined  in  a  more  fundamental  way. 

Specifically,  tabu  search  strategies  for  intensification  and  diversification  are  based  on 
the  following  question:  how  can  information  be  extracted  from  a  set  of  good  solutions  to 
help  uncover  additional  (and  better)  solutions?  From  one  point  of  view,  GAs  provide  an 
approach for answering this question, consisting of putting solutions together and interchanging
components (in some loosely defined sense, if traditional crossover is not strictly
enforced).  Tabu  search,  by  contrast,  seeks  an  answer  by  utilizing  processes  that  specifically 
incorporate  neighborhood  structures  into  their  design. 

Augmented  by  historical  information,  neighborhood  structures  are  used  in  TS  as  a 
basis  for  applying  penalties  and  incentives  to  induce  attributes  of  good  solutions  to  become 
incorporated  into  current  solutions.  Consequently,  although  it  may  be  meaningless  to 
interchange  or  otherwise  incorporate  a  set  of  attributes  from  one  solution  into  another  in  a 
wholesale  fashion,  as  attempted  in  recombination  operations,  a  stepwise  approach  to  this  goal 
through  the  use  of  neighborhood  structures  is  entirely  practicable.  This  observation  provides 
a  basis  for  creating  structured  combinations  of  solutions  that  embody  desired  characteristics 
such  as  feasibility  (Glover,  1991b).  The  use  of  these  structured  combinations  makes  it 
possible  to  integrate  selected  subsets  of  solutions  in  any  system  that  satisfies  three  basic 
properties. Instead of being compelled to create new types of crossover to remove deficiencies
of standard operators upon being confronted by changing contexts, this approach
addresses context directly and makes it an essential part of the design for generating combinations.
The current trend of genetic algorithms seems to be increasingly compatible with this
perspective, particularly in the work by Mühlenbein (1992), and this could provide a basis for
a significant hybrid combination of genetic algorithm and tabu search ideas.

4. Recent Advances Using Tabu Search in Neural Networks.

Our examination of linking tabu search and neural networks in optimization derives from
joint work by J. Chakrapani and J. Skorin-Kapov (1992a, 1992b, 1993). There are both
benefits and challenges associated with using a massively parallel computer
architecture for tabu search, as applied to solving NP-hard combinatorial problems. Our
starting point for addressing these issues is a connectionist model for solving the quadratic
assignment problem (QAP), obtained by modifying the Boltzmann machine used by Aarts and
Korst (1989). This leads to a massively parallel tabu search algorithm for the QAP,
implemented on the Connection Machine CM-2.

Not only is the Connection Machine extremely interesting as a massively parallel
computer  architecture  for  solving  combinatorial  optimization  problems  arising  in  different 
engineering  applications,  some  optimization  problems  are  also  created  in  the  attempt  to  use 
the  machine  as  effectively  as  possible.  Depending  on  the  application,  the  communication 
time  is  not  trivial.  For  the  QAP,  55%  of  time  is  spent  on  interprocessor  communication, 
which  stems  from  a  dynamic  communication  pattern.  There  is  another  class  of  applications 
for  which  the  communication  pattern  is  static,  i.e.  the  memory  locations  defining  the  source 
and  destinations  of  messages  do  not  change,  only  the  communicated  data  changes.  In  such 
applications,  considerable  time  can  be  saved  by  allocating  processors  to  chips  according  to  the 
structure  of  their  communication  pattern  (Dahl,  1990).  A  tabu  search  approach  to  the 
mapping  and  scheduling  problem  successfully  improves  communication  time  on  a  massively 
parallel system, as demonstrated by computational results on some data sets from the literature.

Connectionist models are constructed to follow the analogy with neural networks in
the human brain, and consist of nodes representing neurons, and arcs representing a pattern of
connectivity among the neurons. An activity level is associated with each node, and weights
or connection strengths are associated with each arc. Activity levels and connection strengths
can change according to functions directing the system's behavior. Depending on the values
an activity level may take, connectionist models are classified as analog or binary. Boltzmann
machines are connectionist models employing binary activity levels, determined
probabilistically according to the Boltzmann equation.

With  the  proper  setting  of  connection  strengths,  one  can  establish  the  equivalence 
between  the  objective  function  of  a  combinatorial  problem  and  the  function  governing  the 
behavior  of  the  connectionist  model.  In  such  a  case  the  equilibrium  points  of  the  system's 
function  correspond  to  the  local  minima  of  the  underlying  combinatorial  problem. 

Instead  of  using  simulated  annealing  to  escape  from  local  optima  (as  in  Boltzmann 
machines),  we  have  designed  a  related  connectionist  model  in  which  tabu  search  is  used. 

This represents the first study replacing simulated annealing with tabu search in a
connectionist model, and the first study involving dynamically changing connection strengths
for such problems. The results on the set of QAPs from the literature show that a connection
model based on tabu search performs better than such a model based on simulated annealing.
The connection model itself, though, still does not yield a framework for producing a method
as effective as direct heuristic algorithms for the QAP that also utilize tabu search. The
inefficiency is due to the fact that any binary matrix is a feasible configuration for a model,
and in order to reach a permutation matrix (i.e. a feasible solution to the QAP), large bias
connection strengths are needed, which in turn result in poor solutions to the QAP. In order to
'marry' the idea of a massively parallel connectionist approach and the success of swap
moves for the QAP (which restrict the search to the feasible configurations only), we have
progressed to the design of a massively parallel pairwise exchange algorithm implemented on
the Connection Machine CM-2.

Apart from designing an elaborate algorithmic strategy using various elements of tabu
search (e.g. aspiration, diversification, intensification and varying tabu list sizes), a
straightforward implementation on the Connection Machine, obtained by identifying a pairwise
exchange with a processor, produces a very inefficient utilization of both time and memory.
On the other hand, an efficient implementation requires a fine grain decomposition of the
problem into small identical subproblems suitable for data-level parallel computing. Assuming
n*n processors, such a decomposition results in only a logarithmic increase in time per
iteration as a function of the size of the problem. The logarithmic factor theoretically comes
from finding the maximum of n numbers with n processors, which in the CM-2 is done by the
hardware, and is not significant (e.g. the time increases by 0.1 or 0.2 seconds as the problem
size grows from 42 to 100). If the number of processors available is smaller than the
neighborhood size, virtual processing is invoked, which requires increased time per iteration,
opening possibilities for designing new strategies to handle larger problems.
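
The logarithmic factor mentioned above can be illustrated with a pairwise (tree) reduction. The following is only a sequential Python sketch of that idea, not the CM-2 hardware primitive; the function name `tree_max` is hypothetical and chosen for illustration.

```python
def tree_max(values):
    """Return (maximum, rounds) using a pairwise tree reduction.

    Each round compares disjoint neighbouring pairs. With one processor per
    value, every round would correspond to a single parallel step, so the
    number of rounds grows like log2(n).
    """
    level = list(values)
    if not level:
        raise ValueError("tree_max needs at least one value")
    rounds = 0
    while len(level) > 1:
        nxt = []
        for k in range(0, len(level), 2):
            if k + 1 < len(level):
                nxt.append(max(level[k], level[k + 1]))
            else:
                nxt.append(level[k])  # odd element passes through unchanged
        level = nxt
        rounds += 1
    return level[0], rounds
```

Under this sketch, growing the input from 42 to 100 values adds only one round (6 to 7), which is in line with the small increase in time per iteration reported above.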

After  successfully  solving  QAPs  of  size  up  to  100,  two  questions  remain:  (1)  How  to 
’push’  the  problem  size  even  further;  (2)  How  to  use  massively  parallel  computer  architecture 
to  gain  more  understanding  of  the  tabu  strategy  itself.  We  undertake  to  give  answers  to  these 
questions  by  addressing  the  problem  of  mapping  tasks  to  processors  in  a  multiprocessor 
system,  in  order  to  minimize  communication  time.  We  assume  that  communication  among 
tasks  follows  a  static  pattern,  and  that  all  processors  are  identical  and  all  tasks  are  similar. 

We  stipulate  that  the  number  of  processors  equals  the  number  of  tasks,  by  introducing 
"dummy"  tasks  if  there  are  more  processors.  The  case  where  there  are  more  tasks  is  treated 
within  a  virtual  environment  as  if  there  are  enough  processors  so  that  each  task  can  be 
mapped  to  a  single  processor. 

We  therefore  distinguish  between  the  physical  nodes  of  the  multiprocessor  system  and 
the  processors  themselves.  Each  node  may  contain  more  than  one  processor  (virtually  or 
otherwise).  Processors  in  the  same  node  can  communicate  among  themselves  with  minimal 
time spent in communication (assumed to be zero). Communication time for processors in
different nodes is dependent on the architecture of the system. We approximate the
communication time between two processors by the number of links that a message has to travel
(dilation) and the total communication time by the sum of the individual dilations. Note that
even when there are only two nodes, and the communication pattern is a graph with unit edge
weights, the problem is NP-hard.

We formulate the problem as a quadratic assignment problem of mapping n tasks to n
processors. Denote by D the distance matrix between processors and by F the matrix
representing the communication between tasks. Such a formulation is general enough to
cover different architectures and different communication patterns. Since this would be a
preprocessing routine to an application, emphasis should be given to fast algorithms. Also,
since the application will run on a parallel machine, parallel algorithms are preferable to
reduce sequential bottlenecks.
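
As a minimal sketch of this formulation, the objective can be written as the sum of F[i][j] * D[perm[i]][perm[j]] over all task pairs, where perm[i] is the processor assigned to task i. The example below assumes, purely for illustration, a hypercube with one processor per node, so that the dilation is the Hamming distance between node labels; the function names and the small ring-shaped communication pattern are hypothetical.

```python
def hypercube_distances(dim):
    """D[p][q] = number of links between nodes p and q of a dim-cube
    (the Hamming distance of their binary labels)."""
    n = 1 << dim
    return [[bin(p ^ q).count("1") for q in range(n)] for p in range(n)]


def mapping_cost(F, D, perm):
    """Total communication cost when task i is mapped to processor perm[i]."""
    n = len(perm)
    return sum(F[i][j] * D[perm[i]][perm[j]]
               for i in range(n) for j in range(n))


# Illustrative data: four tasks communicating in a ring, mapped to a 2-cube.
F = [[0] * 4 for _ in range(4)]
for t in range(4):
    F[t][(t + 1) % 4] = 1
    F[(t + 1) % 4][t] = 1
D = hypercube_distances(2)

cost_identity = mapping_cost(F, D, [0, 1, 2, 3])   # tasks 1-2 and 3-0 are two links apart
cost_gray = mapping_cost(F, D, [0, 1, 3, 2])       # every ring edge uses a single link
```

In this toy instance the second mapping places every pair of communicating tasks on adjacent nodes, which is the kind of improvement the mapping problem aims for.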

We  develop  a  heuristic  algorithm  based  on  tabu  search  for  handling  this  problem.  The 
heuristic  employs  a  simple  choice  rule  and  neighborhood  structure  of  iteratively  selecting  a 
pair  of  tasks  in  a  greedy  fashion,  and  swapping  the  processors  to  which  they  are  mapped.  In 
our  parallel  implementation  two  levels  of  parallelism  are  employed.  First,  the  candidate  tasks 
to  be  swapped  are  identified  in  parallel.  Second,  more  than  one  pair  of  tasks  is  swapped  in  a 
single  iteration.  The  computed  effect  of  a  single  swapping  is  based  on  the  assumption  that  no 
other  swapping  takes  place  at  the  current  iteration.  When  performing  multiple  swaps,  the 
cumulative  effect  of  the  swapping  may  not  correspond  to  the  sum  of  the  individual  effects. 

We show how this can lead to inferior performance and illustrate the elements of our
heuristic that make it robust under these circumstances.
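
The two observations above (greedy pairwise swaps, and the non-additivity of simultaneous swaps) can be sketched as follows. This is a small sequential illustration under stated assumptions, not the parallel CM-2 implementation: it recomputes the full cost for every candidate swap, uses a hypothetical fixed tabu tenure with a simple aspiration rule, and takes the hypercube Hamming distance as the dilation.

```python
from itertools import combinations


def hypercube_distances(dim):
    n = 1 << dim
    return [[bin(p ^ q).count("1") for q in range(n)] for p in range(n)]


def cost(F, D, perm):
    n = len(perm)
    return sum(F[i][j] * D[perm[i]][perm[j]]
               for i in range(n) for j in range(n))


def swap_delta(F, D, perm, i, j):
    """Change in cost if only tasks i and j exchange processors."""
    trial = list(perm)
    trial[i], trial[j] = trial[j], trial[i]
    return cost(F, D, trial) - cost(F, D, perm)


def tabu_swap_search(F, D, start, iterations=20, tenure=2):
    """Greedy pairwise-exchange search with a short-term tabu memory."""
    perm = list(start)
    best_perm, best_cost = list(perm), cost(F, D, perm)
    tabu_until = {}  # swap (i, j) -> last iteration at which it is still tabu
    for it in range(iterations):
        candidate = None
        for i, j in combinations(range(len(perm)), 2):
            c = cost(F, D, perm) + swap_delta(F, D, perm, i, j)
            is_tabu = tabu_until.get((i, j), -1) >= it
            if is_tabu and c >= best_cost:   # aspiration: allow new best solutions
                continue
            if candidate is None or c < candidate[0]:
                candidate = (c, i, j)
        if candidate is None:
            break
        c, i, j = candidate
        perm[i], perm[j] = perm[j], perm[i]
        tabu_until[(i, j)] = it + tenure
        if c < best_cost:
            best_cost, best_perm = c, list(perm)
    return best_perm, best_cost


# Illustrative data: ring of four tasks on a 2-cube, identity start.
F = [[0] * 4 for _ in range(4)]
for t in range(4):
    F[t][(t + 1) % 4] = F[(t + 1) % 4][t] = 1
D = hypercube_distances(2)
start = [0, 1, 2, 3]

summed = swap_delta(F, D, start, 0, 1) + swap_delta(F, D, start, 2, 3)
actual = cost(F, D, [1, 0, 3, 2]) - cost(F, D, start)
best_perm, best_cost = tabu_swap_search(F, D, start)
```

In this toy instance each of the two swaps improves the cost by 4 on its own, yet performing both at once leaves the cost unchanged, so `summed` and `actual` differ. A heuristic that performs several swaps per iteration therefore has to guard against acting on individually computed gains.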

The heuristic is tested on the hypercube architecture, and implemented on the CM-2.
Computations are performed on data originating from a finite element application, with sizes
ranging from 8000 up to 64000 tasks. The result yields a new form of connection approach
deriving from an integration with tabu search, and overcomes limitations of earlier connection
models in this setting. We anticipate the value of integrating tabu search with connection
models for additional types of applications in the future.

Abstract

Recent  advances  in  computing  technology  have  created  a  situation  where  we  can 
solve  larger  problems  than  we  can  understand.  This  is  true  for  most  linear 
programs,  and  it  is  becoming  increasingly  true  for  nonlinear  and  integer  forms.  In 
addition,  model  management  can  be  confounded  by  large  models,  especially  when 
they  are  eclectic.  To  deal  with  this  bottleneck  in  productive  use  of  mathematical 
programming  for  decision  support,  a  research  project  began  in  1985  to  develop  an 
Intelligent  Mathematical  Programming  System  (IMPS). 

Some  of  the  problems  we  address  in  the  IMPS  project  are: 

•  Find  a  reformulation  that  simplifies  the  model. 

•  Infer  data  relations  that  are  necessary  for  the  instance  to  be  well  posed  (i.e., 
feasible  and  bounded). 

•  Give  different  views  of  the  model,  or  an  instance  of  it,  that  provide  different 
insights. 

•  Answer  questions  of  sensitivity:  What  if...?,  Why...?,  and  Why  not...?
Furthermore,  form  responses  in  English,  graphics  and  other  forms  under 
user  control. 

•  Provide  aids  for  model  debugging,  such  as  why  an  instance  is  infeasible. 

•  Provide  aids  for  documenting  a  model  and  scenarios. 

The  IMPS  project  is  focused  on  producing  new  ideas  for  modeling  and  analysis 
support  and  has  produced  new  approaches  to  model  formulation,  management  and 
applications.  After  elaborating  on  this  background,  including  our  meaning  of 
intelligence  and  opportunities  for  creating  an  intelligent  computing  environment,  we 
present  some  examples  of  results,  both  positive  and  negative. 

Analysis  support  is  one  of  the  areas  the  IMPS  project  has  pursued  extensively, 
producing an advanced software system, called ANALYZE, which includes rule-based
interpretations of results. The results could be just a prototype instance
without a solution, where analysis looks for reductions and embedded structures in
the  interest  of  verification  and  documentation  or  other  elements  of  model 
management.  The  results  could  be  optimal  solutions  of  scenarios,  where  analysis 
not  only  answers  conventional  sensitivity  questions,  but  also  probes  more  deeply 
into  the  meaning  of  the  results.  The  results  could  be  infeasible  or  anomalous,  where 
analysis  is  debugging  the  runs  to  diagnose  the  cause  of  the  infeasibility  or  anomaly. 

A  more  recent  development  is  a  new  approach  to  the  pooling  problem,  which  is  a 
non-convex  mathematical  program,  that  gives  exact  answers  to  sensitivity  questions, 
rather  than  the  usual  methods  based  on  Lagrange  multipliers,  which  can  give 
erroneous  answers.  We  shall  demonstrate  how  this  is  done  using  computational 
geometry. 

One  of  the  negative  results,  which  we  illustrate,  is  the  use  of  neural  networks  for 
assisting  formulation.  On  the  surface,  the  approach  seems  reasonable,  but  we  failed 
to  find  an  appropriate  neural  net  to  represent  the  problems  we  describe.  Although 
this  approach  has  not  been  abandoned  completely,  it  has  not  worked  so  far, 
particularly  compared  with  other  approaches  we  have  taken,  notably  syntax-directed 
modeling  assistance. 

After  describing  some  of  the  results,  both  positive  and  negative,  we  summarize 
current and future activities within the IMPS project. An extensive bibliography is
included to indicate the great amount of research and development activity that has
emerged over the past few years to address the problems in modeling and analysis.

Abstract

When  are  two  mathematical  programs  equivalent?  That  is  the  question  addressed 
in  this  paper.  Since  the  beginnings  of  mathematical  programming,  people  have 
found  reformulations  of  mathematical  programs  that  have  certain  desirable 
characteristics.  Usually,  the  reformulation  reveals  a  special  structure,  such  as  a 
network,  that  speeds  up  the  numerical  optimization.  More  recently,  interest  in 
reformulation  is  motivated  by  other  characteristics,  such  as  for  better  model 
management  or  understanding  results  more  deeply  for  the  decision  problem 
represented  by  the  mathematical  program. 

The  usual  way  in  which  one  establishes  equivalence  between  the  original 
formulation  and  another  one  is  to  provide  mappings  between  the  two  formulations 
that  preserve  feasibility  and  optimality.  We  demonstrate  that  this  mathematical 
approach  to  equivalence  provides  only  a  necessary  condition,  not  a  sufficient  one,  by 
showing  that  it  also  makes  equivalent  mathematical  programs  that  really  have  no 
relation  to  each  other.  Besides  the  mathematical  approach,  which  has  been  used  for 
decades,  there  is  a  linguistic  approach  that  has  emerged  more  recently.  Whereas 
the  mathematical  approach  provides  a  necessary  condition,  the  linguistic  approach 
provides  a  sufficient  one.  The  linguistic  approach  that  has  been  proposed  is  too 
strong to contain necessary conditions. In particular, it does not allow variable
redefinition, except in name. The first part of this paper elaborates on these
approaches  and  describes  some  difficulties  with  a  formal  definition  of  equivalence 
through  examples.  Then,  representative  cases,  mostly  taken  from  the  literature,  are 
presented  in  order  to  build  an  intuition  about  equivalence. 

The  second  part  considers  operational  definitions  of  equivalence  and  examines  their 
scope.  The  necessary  condition  of  mappings,  taken  from  the  mathematical 
approach,  is  included,  but  a  key  difference  pertains  to  separation  of  data  from 
structure.  Our  motive  for  having  a  formal  definition  of  equivalence  is  to  create  an 
artificially  intelligent  environment  for  modeling,  where  model  matching  can  aid  an 
initial  formulation,  and  reformulation  can  be  automatic. 


Vladimir  A.  Gurvich 

Extremal  integer  sequences  with  forbidden  sums 
Extended  Abstract 

1. Basic notions and notations. Let $\mathbb{Z}_+ = \{1, 2, \dots\}$ be the set
of positive integers and let $s = (s_1, \dots, s_\ell)$ be a finite
sequence of numbers from $\mathbb{Z}_+$. Then let $\ell = \ell(s)$ be the length,
$n(s) = s_1 + s_2 + \dots + s_\ell$ be the sum and $m(s) = n(s)/\ell(s)$ be the mean
of the sequence $s$. The sequence $s'$ will be called an interval of $s$,
denoted $s' \subseteq s$, if there exist numbers $i, j \in \mathbb{Z}_+$
such that $1 \le i \le j \le \ell(s)$ and $s' = (s_i, \dots, s_j)$.

Then let $F \subseteq \mathbb{Z}_+$ be a finite set called the set of forbidden
sums. The sequence $s$ will be called F-excluding if it does not
contain an interval whose sum belongs to $F$. We denote by $\mathrm{EX}(\ell; F)$
the set of all F-excluding sequences of length $\ell$, that is

(1) $\mathrm{EX}(\ell; F) = \{ s \mid \ell(s) = \ell \text{ and } n(s') \notin F \ \ \forall\, s' \subseteq s \}$.

We denote by $\mathrm{Gex}(F)$ the infinite F-excluding lexicographically
minimal sequence and by $\mathrm{Gex}(\ell; F)$ its initial interval of length
$\ell$. "The greedy algorithm" realizes $\mathrm{Gex}(F)$ by induction. Successively
for each $\ell = 1, 2, \dots$ the sequence $\mathrm{Gex}(\ell; F)$ is obtained from
the sequence $\mathrm{Gex}(\ell - 1; F)$ by adding the minimal number $s_\ell \in \mathbb{Z}_+$
such that $s_{\ell'} + s_{\ell'+1} + \dots + s_\ell \notin F$ for any $\ell' \le \ell$. In particular

(2) $\mathrm{Gex}(\ell'; F) \subseteq \mathrm{Gex}(\ell; F) \subseteq \mathrm{Gex}(F) \quad \forall\, \ell, \ell' \in \mathbb{Z}_+ \mid \ell' \le \ell$.
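
The greedy construction just described can be sketched in a few lines of Python. This is an illustrative sketch only; the names `fits` and `gex_prefix` are hypothetical, and the early exit relies on the fact that interval sums only grow as the interval is extended to the left.

```python
def fits(seq, x, forbidden, largest):
    """True if appending x to seq creates no interval ending at x
    whose sum lies in the forbidden set."""
    total = x
    if total in forbidden:
        return False
    for y in reversed(seq):
        total += y
        if total > largest:      # longer intervals can only have larger sums
            return True
        if total in forbidden:
            return False
    return True


def gex_prefix(length, forbidden):
    """Initial interval of the given length of the greedy sequence Gex(F)."""
    forbidden = set(forbidden)
    largest = max(forbidden)
    seq = []
    for _ in range(length):
        x = 1
        while not fits(seq, x, forbidden, largest):
            x += 1               # terminates: x = largest + 1 always fits
        seq.append(x)
    return seq
```

For instance, `gex_prefix(8, {4, 7})` yields two copies of the word 1 1 1 8, and `sum(gex_prefix(l, F))` gives the value written gex(l; F) in the text.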

In the present paper we shall study the following two functions

(3) $\mathrm{gex}(\ell; F) := n(s)$, where $s = \mathrm{Gex}(\ell; F)$, and

(4) $\mathrm{ex}(\ell; F) := \min \{ n(s) \mid s \in \mathrm{EX}(\ell; F) \}$,

realized respectively by the lexicographically and additively minimal
F-excluding sequences of length $\ell$.
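
For small lengths, the function ex of definition (4) can be evaluated by exhaustive search. The sketch below is a plain branch-and-bound under one stated assumption: entries larger than max(F) + 1 are never needed, because lowering such an entry to max(F) + 1 keeps the sequence F-excluding and reduces its sum. The function name `ex` simply mirrors the notation of the text; this is not the author's program.

```python
def ex(length, forbidden):
    """Minimum sum of an F-excluding sequence of the given length."""
    forbidden = set(forbidden)
    cap = max(forbidden) + 1     # larger entries are never needed
    best = length * cap          # the constant sequence (cap, ..., cap) is F-excluding
    seq = []

    def extend(total):
        nonlocal best
        # Every remaining entry is at least 1, so this bound is safe.
        if total + (length - len(seq)) >= best:
            return
        if len(seq) == length:
            best = total
            return
        for x in range(1, cap + 1):
            suffix, ok = x, x not in forbidden
            for y in reversed(seq):
                if not ok or suffix > cap:
                    break        # either rejected, or no forbidden sum reachable
                suffix += y
                ok = suffix not in forbidden
            if ok:
                seq.append(x)
                extend(total + x)
                seq.pop()

    extend(0)
    return best
```

The search is exponential in the length and is meant only for checking small cases, such as comparing ex with gex on the examples of § 4.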

2. Extremal sequences with forbidden sums and the extremal graph
theory. We can obtain a natural generalization if we replace a set
of forbidden sums $F$ by a set of forbidden intervals $\mathcal{S}$.
But most of the results given below will not generalize to this case.
If $\mathcal{S} = \mathcal{S}(F) := \{ s \mid n(s) \in F \}$ then we obtain our problem.

Note an apparent analogy between $\mathrm{ex}(\ell; \mathcal{S})$ and the function with
the same name studied by extremal graph theory; see for example [1].
Then the function $\mathrm{ex}(\ell; F)$ is analogous to that considered in [2].

3.  Basic  results. 

Theorem 1. Both functions ex and gex are uniform, that is

(5) $\mathrm{ex}(i\ell; iF) = i\,\mathrm{ex}(\ell; F), \quad \mathrm{gex}(i\ell; iF) = i\,\mathrm{gex}(\ell; F) \quad \forall\, i, \ell, F$,

where $iF := (ia_1, ia_2, \dots, ia_k)$ provided $F = (a_1, a_2, \dots, a_k)$; see § 7.

Theorem 2. The ratio of gex to ex can be estimated by the inequalities

(6) $1 \le \mathrm{gex}(\ell; F)/\mathrm{ex}(\ell; F) \le (\#F + 1)/2 \quad \forall\, \ell, F$,

which cannot be sharpened for any $\#F$. In particular, if $\#F = 1$ then
the equation $\mathrm{gex}(\ell; F) = \mathrm{ex}(\ell; F)$ holds and if $\#F > 1$ then the second
inequality in (6) can be replaced by the strict one; see §§ 4.2, 4.7.

Theorem 3. For any $F$ the sequence $\mathrm{Gex}(F)$ is quasiperiodical,
that is, some infinite interval of it is periodical; see § 8.

The symmetrical sets, defined by the condition

(7) $\exists\, n = n(F) \mid a \in F \Leftrightarrow n - a \in F$,

will play an important role. For the symmetrical sets we shall give
an asymptotically exact estimation of the functions gex and ex;
in other words we shall determine the limits

(8) $m(F) = \lim\, (\mathrm{ex}(\ell; F)/\ell)$ and $m_g(F) = \lim\, (\mathrm{gex}(\ell; F)/\ell)$ for $\ell \to \infty$.

Theorem 4. The limits (8) exist for any $F$. In the case of
a symmetrical $F$ they are realized by two periodical sequences
such that in each one the sum of all the numbers in each period
is equal to $n(F)$; see § 6.1.

In particular condition (7) holds for $\#F = 1$ and $\#F = 2$.
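
Condition (7) is easy to test mechanically. The sketch below uses one observation that is not stated in the text but follows from (7): the map a -> n - a reverses the order of F, so the only possible value of n(F) is min(F) + max(F). The function name `symmetry_sum` is hypothetical.

```python
def symmetry_sum(forbidden):
    """Return n(F) if F satisfies condition (7), otherwise None."""
    F = set(forbidden)
    n = min(F) + max(F)          # the only candidate: min(F) must map to max(F)
    return n if all(n - a in F for a in F) else None
```

With this check, every one-element set (a) is symmetrical with n = 2a and every two-element set (a, b) with n = a + b, as noted above, while a set such as (1, 8, 9) is not symmetrical.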

Theorem 5. For any set $F$ there exists a symmetrical set $F'$
(respectively $F'_g$) such that $F \subseteq F'$ (respectively $F \subseteq F'_g$), the
functions $\mathrm{ex}(\ell; F)$ and $\mathrm{ex}(\ell; F')$ (respectively $\mathrm{gex}(\ell; F)$ and
$\mathrm{gex}(\ell; F'_g)$) are asymptotically equivalent and realized by the same
periodical sequence. Moreover $n(F')$ is always equal to the sum
of some two different numbers from $F$.

But $n(F'_g)$ can be much greater; see § 4.6.

4. Examples. 4.1. For the set $F = (1, 2, \dots, j)$ we obtain
$\mathrm{Gex}(F) = (j+1, j+1, \dots) = (j+1)^\infty$; $\mathrm{Gex}(\ell; F) = (j+1)^\ell$; Ex = Gex;
$\mathrm{ex}(\ell; (1, 2, \dots, j)) = \mathrm{gex}(\ell; (1, 2, \dots, j)) = \ell(j+1)$;
$m(1, 2, \dots, j) = m_g(1, 2, \dots, j) = j + 1$;
$\mathrm{ex}(i\ell; (i, 2i, \dots, ji)) = \mathrm{gex}(i\ell; (i, 2i, \dots, ji)) = i\ell(j+1)$.

The last formula demonstrates that the functions ex and gex
are uniform; see §§ 3, 7. In particular for $\ell = j = 1$ we obtain
$\mathrm{ex}(i; (i)) = \mathrm{gex}(i; (i)) = 2i$; $\mathrm{Gex}(i; (i)) = (1^{i-1}(i+1))$.

4.2. Let $F = (i)$ be the single-element set. Then
$\mathrm{ex}(\ell; (i)) = \mathrm{gex}(\ell; (i)) = 2i \lfloor \ell/i \rfloor + \ell \pmod{i}$;
$\mathrm{Gex}((i)) = (1^{i-1}(i+1))^\infty$; $m(i) = m_g(i) = 2$.

4.3. Let $F$ be an arbitrary set of odd numbers. If $1 \in F$ then
$\mathrm{Gex}(F) = (2, 2, \dots) = (2)^\infty$; $\mathrm{ex}(\ell; F) = \mathrm{gex}(\ell; F) = 2\ell \ \ \forall\, \ell$.
In any case $m(F) = m_g(F) = 2$. The exclusion of odd sums is
analogous to the exclusion of subgraphs whose chromatic numbers
are not less than 3; see [1].

4.4. The functions ex and gex can be different and even
asymptotically not equivalent. The simplest examples are given
by the sets (4,7), (4,9), (5,8).

$\mathrm{Gex}(4,7) = (1\,1\,1\,8)^\infty = (1^3\,8)^\infty$, $m_g(4,7) = 11/4$;
$\mathrm{Gex}(4,9) = (1^3\,5^2)^\infty$, $\mathrm{Gex}(5,8) = (1^4\,9)^\infty$, $m_g(4,9) = m_g(5,8) = 13/5$.

At the same time $m(4,7) = 11/5$, $m(4,9) = m(5,8) = 13/6$.
The functions $\mathrm{ex}(\ell; (4,7))$, $\mathrm{ex}(\ell; (4,9))$, $\mathrm{ex}(\ell; (5,8))$
are asymptotically realized respectively by the periodical sequences
$(1\,2\,3\,3\,2)^\infty$, $(1\,2\,3\,2\,3\,2)^\infty$, $(1\,2\,1\,3\,3\,3)^\infty$,
which contain no intervals with forbidden sums. Note that
$\mathrm{ex}(5i; (4,7)) = 11i$; $\mathrm{ex}(6i; (4,9)) = \mathrm{ex}(6i; (5,8)) = 13i \ \ \forall\, i$.

4.5. According to Theorem 3, for any set $F$ the sequence $\mathrm{Gex}(F)$
is quasiperiodical, that is $\mathrm{Gex}(F) = (s^0(F))(s_g(F))^\infty$, and the initial
interval $s^0(F)$ can be nonempty. For example,

$\mathrm{Gex}(2,4,7) = (1\,5)(3)^\infty$; $\mathrm{Gex}(3,6,10) = (1\,1\,7)(4\,1\,4\,4)^\infty$;
$\mathrm{Gex}(2,11,12) = (1\,3\,1\,3\,1\,5)(4\,4\,1\,4\,1\,3\,1\,4\,1)^\infty$.

Introduce the notations $n_g(F) = n(s_g(F))$, $\ell_g(F) = \ell(s_g(F))$. Then
$m_g(F) = m(s_g(F)) = n(s_g(F))/\ell(s_g(F)) = n_g(F)/\ell_g(F)$.

4.6. The sequence Gex can have a "very long" period. For example,
$\mathrm{Gex}(3,7,10) = ((1\,1\,4)\,8\,(1\,4\,1)\,8\,(4\,1\,1)\,11)^\infty$, $m_g(3,7,10) = 15/4$;
$\mathrm{Gex}(3,8,11) = (1\,1\,4\,1\,9\,4\,1\,1\,4\,9\,1\,4\,1\,1\,12)^\infty$, $m_g(3,8,11) = 18/5$;
$\mathrm{Gex}(4,9,13) = ((1^3\,5)\,10\,(1^2\,5\,1)\,10\,(1\,5\,1^2)\,10\,(5\,1^3)\,14)^\infty$;

$\mathrm{Gex}(j,\, 2ji+1,\, 2ji+j+1) = ((1^{j-1}(j+1))^i\,(2ji+2)\ (1^{j-2}(j+1)\,1)^i\,(2ji+2) \dots (1^{j-k}(j+1)\,1^{k-1})^i\,(2ji+2) \dots (1\,(j+1)\,1^{j-2})^i\,(2ji+2)\ ((j+1)\,1^{j-1})^i\,(2ji+j+2))^\infty$
$= \bigl(\bigl(\prod_{k=1}^{j} ((1^{j-k}(j+1)\,1^{k-1})^i\,(2ji+2))\bigr) \oplus j\bigr)^\infty$,

where the sign $\oplus$ means that $j$ is added to the last number
of the word. Thus for $F = (j,\, 2ji+1,\, 2ji+j+1)$ we obtain

$n_g(F) = j(4ij+3)$, $\ell_g(F) = j(ij+1)$, $m_g(F) = 4 - 1/(ij+1)$.

In general we have no good upper estimate for $n_g(F)$ and $\ell_g(F)$.
If the set $F$ is symmetrical then $n_g(F)$ is a divisor of $n(F)$.

4.7. "Big ratios" gex/ex are realized by the arithmetic
progressions $F = (2i,\, 4i-1,\, 6i-2,\, \dots,\, 2ji-j+1)$ that have
the length $\#F = j$, the step $2i-1$ and the initial number $2i$.
Then $\mathrm{Gex}(F) = (1^{2i-1}(2ji-j+2))^\infty$ and $\mathrm{ex}(\ell; F)$ is asymptotically
realized by the periodical sequence $(1\,2\,(2^{i-2}\,3)^{j}\,2^{i-1})^\infty$. Thus
$n_g(F) = n(F) = 2ij+2i-j+1$, $\ell_g(F) = 2i$, $\ell(F) = ij+i-j+1$;
$m_g(F)/m(F) = \ell(F)/\ell_g(F) = (j+1)/2 - (j-1)/(2i)$;
$\mathrm{gex}(n_0; F)/\mathrm{ex}(n_0; F) = (j+1)/2 - (j-1)/(2i)$, where
$n_0 = \ell_g(F)\,\ell(F)\,n(F) = 2i\,(ij+i-j+1)\,(2ij+2i-j+1)$;
$(j+1)/2 - (j-1)/(2i) \to (j+1)/2$ provided $i \to \infty$; $j = \#F$.

4.8. Let us explain some notations of §§ 4.1-4.7. The sequences
Gex and Ex $\in$ EX are given as words in the alphabet $\mathbb{Z}_+$.
Some intervals are marked by parentheses. The power $i$
over a parenthesis means that the corresponding interval is repeated
$i$ times. The power over the last parenthesis can take the value $\infty$.

5.  Minimal  F-excluding,  absolutely  F-excluding  and  F-critical 
sequences;  the  asymptotic  realization  of  the  function  ex 
by  an  infinite  periodical  sequence. 

5.1. A finite F-excluding sequence $s$ will be called minimal
(mFe) if it realizes the value of the function ex, that is if
$n(s) = \mathrm{ex}(\ell(s); F)$.

Lemma 1. Any mFe sequence $s$ realizes a lower estimate
$m(F) \ge m(s) = n(s)/\ell(s)$.

Proof is based on the following evident but important inequality
(9) $\mathrm{ex}(i\ell; F) \ge i\,\mathrm{ex}(\ell; F) \quad \forall\, i, \ell, F$.

Thus for any $i$, $F$ and $\ell = \ell(s)$ we obtain

$\mathrm{ex}(i\ell; F)/(i\ell) \ge i\,\mathrm{ex}(\ell; F)/(i\ell) = \mathrm{ex}(\ell; F)/\ell = n(s)/\ell(s) = m(s)$.

5.2. A finite sequence $s$ will be called absolutely F-excluding
(aFe) if the infinite periodical sequence $(s)^\infty$ is F-excluding.

Any aFe sequence is F-excluding but not vice versa.

Lemma 2. Any aFe sequence $s$ realizes an upper estimate
$m(F) \le m(s) = n(s)/\ell(s)$.

Proof. For any $i$, $F$ we obtain $\mathrm{ex}(\ell(s); F) \le n(s)$;
$\mathrm{ex}(i\ell(s); F) \le i\,n(s)$; $\mathrm{ex}(i\ell(s); F)/(i\ell(s)) \le i\,n(s)/(i\ell(s)) = m(s)$.

Lemma 3. The following eight properties of the finite sequence $s$
are equivalent:

a) $s$ is aFe; a') $(s)^\infty$ is F-excluding;

b, b') $(s)^i$ is aFe for any (for some) $i \in \mathbb{Z}_+$;

c, c') the sequence $\sigma(s)$ is aFe for any (for some) cyclic
permutation $\sigma$ of the sequence $s$;

d) the sequence inverse to $s$ is aFe;

e) for any number $a \in F$ and interval $s' \subseteq s$ the following
inequalities hold: $n(s') \not\equiv a \pmod{n(s)}$ and $n(s') \not\equiv -a \pmod{n(s)}$.
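
Property e) gives a finite test for the aFe property. The sketch below implements that test and, as a cross-check, a direct test that unrolls enough periods of $(s)^\infty$ to see every interval whose sum could reach max(F). Both function names are hypothetical and the code is an illustration only.

```python
def is_afe(s, forbidden):
    """Test of property e): no interval sum of s is congruent
    to a or -a modulo n(s) for any forbidden a."""
    n = sum(s)
    bad = {a % n for a in forbidden} | {(-a) % n for a in forbidden}
    for i in range(len(s)):
        total = 0
        for j in range(i, len(s)):
            total += s[j]
            if total % n in bad:
                return False
    return True


def is_afe_unrolled(s, forbidden):
    """Direct test: search the periodic word (s)(s)(s)... for a forbidden sum."""
    forbidden = set(forbidden)
    largest = max(forbidden)
    copies = largest // len(s) + 2      # long enough for every sum <= largest
    word = list(s) * copies
    for i in range(len(s)):             # by periodicity, starts in the first period suffice
        total = 0
        for x in word[i:]:
            total += x
            if total > largest:
                break
            if total in forbidden:
                return False
    return True
```

For example, both tests accept the word 1 2 3 3 2 for F = (4,7) and the word 3 4 3 5 5 for F = (1,2,6,11), and both reject 1 1 1 for F = (4), since four consecutive ones of the periodic sequence sum to 4.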

5.3. A finite sequence will be called F-critical if it is
mFe and aFe simultaneously. Lemmas 1 and 2 result in

Proposition 1. For any F-critical sequence s the equation
m(F) = m(s) holds. Thus s realizes the asymptotically exact estimate
of the function $ex(\ell; F)$.

5.4. Let s be an arbitrary aFe sequence such that n(s) = n(F).
Then as a rule the sequence $(s)^i$ will be F-critical for some $i \in \mathbb{Z}_+$.
For example the sequence $(3\ 3\ 4)^i$ is F-critical for the set
F = {1, 8, 9} if i = 5, but not for i ≤ 4.

A PC program computing m(F) in this way was written by S. Tarassov.
The method is rather efficient but unfortunately it sometimes fails.
The simplest example is given by the set F = {1, 2, 6, 11}. In this case
there are infinitely many aFe sequences that realize the value
m(F) = 4; for example, $(4)^\infty$, $(3\ 4\ 5)^\infty$, $(3\ 5\ 4)^\infty$, $(3\ 4\ 3\ 5\ 5)^\infty$,
$((3\ 4\ 3)(5\ 4\ 3)^i(5\ 4\ 5))^\infty$ $\forall\, i \in \mathbb{Z}_+$, etc.; see Lemma 3. However neither
these sequences nor their powers are mFe. Indeed, let us consider the
infinite quasiperiodical sequence $s = (3\ 4\ 3)(5\ 4\ 3)^\infty$. The inequality $m(s') < 4$
holds for any initial interval $s'$ of s. For the considered set
F = {1, 2, 6, 11} there exists no F-critical sequence at all.

5.5. However according to Theorem 5 the value m(F) can be
determined by the following formula

$m(F) = \min \{\, m(F') \mid F \subseteq F' \in SYM,\ n(F') \le 2\max(a \mid a \in F) \,\}.$

In other words, the minimum of m(F') is taken over all
the symmetrical sets F' that contain the set F and satisfy
the inequality $n(F') \le 2\max(a \mid a \in F)$.

In the next section we shall see that it is not difficult
to determine the values $m$ and $m_g$ in the case of symmetrical sets.

6. Symmetrical sets of the forbidden sums.

6.1. Let F be a symmetrical set; see (7). Consider the set
of all the sequences whose sums are equal to n(F), that is

$SN(F) = \{ s \mid n(s) = n(F) \}.$

Lemma 4. A sequence $s \in SN(F)$ is F-excluding if and only if
it is aFe.

Denote the corresponding subset of SN(F) by SNE(F). Choose in
SNE(F) the lexicographically minimal sequence $s_g$ and an arbitrary
sequence s of the maximal length. Let $\ell(F) := \ell(s)$, $\ell_g(F) := \ell(s_g)$.
Evidently $\ell(F) \ge \ell_g(F)$.

Proposition 2. The sequence s is F-critical and $Gex(F) = (s_g)^\infty$.

This statement results in the equations

(10) $m(F) = n(F)/\ell(F), \quad m_g(F) = n(F)/\ell_g(F).$

Thus the periodical sequences $(s)^\infty$ and $(s_g)^\infty$ asymptotically
realize the functions ex and gex respectively. Note that by
the definition $n(s) = n(s_g) = n(F)$ in accordance with Theorem 4.

6.2. Let $N(F) := \{1, 2, \ldots, n(F) - 1\}$ and $G := N(F) - F$.
Evidently the set G is also symmetrical and N(F) = N(G), n(F) = n(G).

Theorem 6. For any symmetrical set the following inequality holds

(11) $n(F) = n(G) \ge \ell(F)\,\ell(G).$

Formulae (10) and (11) result in

(12) $m(F) = n(F)/\ell(F) \ge \ell(G), \quad m(G) = n(G)/\ell(G) \ge \ell(F);$

(13) $m(F)\, m(G) = n^2/(\ell(F)\,\ell(G)) \ge n(F) = n(G).$

Proposition 3. For any set F we obtain

(14) $m(F) = m_g(F) = 1$ if $F = \emptyset$; $\quad 2 \le m(F) \le m_g(F)$ if $F \ne \emptyset$.

Proof for a symmetrical F immediately results from (12).

6.3. Any set F such that #F ≤ 2 is symmetrical. The case #F = 1
was already considered in § 4.2. Let F = {i, j}, then n(F) = i + j.
According to Theorem 1 both functions ex and gex are uniform.
This statement results in the following equations

(15) $m(i, j) = m(ik, jk), \quad m_g(i, j) = m_g(ik, jk) \quad \forall\, i, j, k \in \mathbb{Z}_+.$

Then $m(i, j) = m_g(i, j) = 2$ if both numbers i and j are odd;
see § 4.3. Thus without a loss of generality we can assume that

a) j ≥ i; b) GCD(i, j) = 1; c) i is even, j is odd or vice versa.

Proposition 3. The following formulae hold

$n(i,j) = n_g(i,j) = i + j$;

$\ell(i,j) = \lfloor (i+j)/2 \rfloor$; $\quad \ell_g(i,j) = \lfloor (i+j)/2 \rfloor - \lfloor (j-i)/2 \rfloor (\mathrm{mon}\ i)$,
where $a(\mathrm{mon}\ b) := \min\bigl(b \pmod a,\ -(b+1) \pmod a\bigr) \in \{0, 1, \ldots, \lceil a/2 \rceil - 1\}$;

$m(i,j) = n(i,j)/\ell(i,j) = (i+j)/\lfloor (i+j)/2 \rfloor = 2 + 1/\lfloor (i+j)/2 \rfloor$;

$m_g(i,j) = n_g(i,j)/\ell_g(i,j) = (i+j)/\bigl(\lfloor (i+j)/2 \rfloor - \lfloor (j-i)/2 \rfloor (\mathrm{mon}\ i)\bigr)$;

$m_g(i,j)/m(i,j) = 1/\bigl(1 - (\lfloor (j-i)/2 \rfloor (\mathrm{mon}\ i))/\lfloor (i+j)/2 \rfloor\bigr) < 3/2$.

The upper estimate 3/2 can not be sharpened; see § 4.7.

The important peculiar property of the case #F = 2 is
the uniqueness of the F-critical sequence. More exactly

Proposition 4. For any i, j there exists the unique
(up to the cyclic permutations) (i, j)-excluding sequence s that has
the length $\ell(s) = \lfloor (i+j)/2 \rfloor$ and the sum n(s) = i + j.
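
For small i and j the length formula and the uniqueness claim can be checked by enumeration. The sketch below is only an illustration under stated assumptions: it lists every ordered decomposition of i + j into positive integers, keeps those whose periodic repetition avoids the interval sums i and j, and reports the longest ones. The names `compositions`, `cyclically_excluding` and `longest_critical` are hypothetical.

```python
def compositions(total):
    """Yield every ordered tuple of positive integers that sums to total."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            yield (first,) + rest


def cyclically_excluding(seq, forbidden):
    """True if no interval of the periodic sequence (seq)^infinity sums to a forbidden value."""
    limit = max(forbidden)
    n = len(seq)
    for start in range(n):
        total, k = 0, start
        while total < limit:
            total += seq[k % n]
            if total in forbidden:
                return False
            k += 1
    return True


def longest_critical(i, j):
    """Return (length, sequences) for the longest cyclically (i, j)-excluding sequences of sum i + j."""
    forbidden = {i, j}
    good = [s for s in compositions(i + j) if cyclically_excluding(s, forbidden)]
    length = max(len(s) for s in good)  # good is never empty: (i + j,) always qualifies
    return length, [s for s in good if len(s) == length]
```

For instance, `longest_critical(3, 4)` returns length 3 together with the three cyclic shifts of (1 1 5). The enumeration grows as 2^(i+j-1), so it is suitable only for small pairs.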

7. Proof of Theorem 1 (sketch). Fix an arbitrary sequence
$s = (s_1, \ldots, s_\ell)$ and a number $i \in \mathbb{Z}_+$. Consider the sequence

$si := \bigl((1)^{i-1}(s_1 i - i + 1)\ (1)^{i-1}(s_2 i - i + 1)\ \ldots\ (1)^{i-1}(s_\ell i - i + 1)\bigr).$

Evidently $\ell(si) = i\,\ell(s)$, $n(si) = i\,n(s)$, $m(si) = m(s)$.

The sequence s is a) F-excluding, b) mFe, c) lexicographically
minimal F-excluding if and only if the sequence si is respectively
a') iF-excluding, b') m(iF)e, c') lexicographically minimal
iF-excluding. Let us prove for example the implication a) ⇒ a').

For any $j = 1, 2, \ldots, \ell$ the following equations and inequalities hold

$s_j i - i + 1 > i(s_j - 1), \quad s_j i - i + 1 + 2(i-1) = i(s_j + 1) - 1 < i(s_j + 1).$

Thus if for an interval $s' \subset si$ the sum n(s') is a multiple of
i, then s' = s''i, where s'' is an interval of s. But $n(s'') \notin F$
because the sequence s is F-excluding. Thus $n(s') = i\,n(s'') \notin iF$.

Consider also the implication b) ⇒ b'). The sequence s is mFe.
We shall prove that si is m(iF)e. It is proved already that si
is iF-excluding. Assume that it is not minimal, that is there exists
an iF-excluding sequence s' such that $\ell(s') = i\ell$ and $n(s') < n(si)$. Consider
all the initial intervals $s'_j \subset s'$ and the numbers $n(s'_j) \pmod i$;
$j = 1, 2, \ldots, i\ell$. There are $i\ell$ numbers which take only i values.
There are two possibilities: either each value occurs exactly ℓ times
or there is a value that occurs not less than ℓ + 1 times. In both cases
in s' we can outline ℓ successive disjoint intervals whose sums
are multiples of i. Replace each interval by the corresponding sum
divided by i.

We obtain a sequence s'' such that $\ell(s'') = \ell$ and $n(s'') < n(s)$.
Moreover the sequence s'' is F-excluding because s' was
iF-excluding. But the sequence s was mFe. Contradiction.

8. Proof of Theorem 3. Introduce the set of sequences

(16) $S(F) = \{\, s \mid n(s) \ge \max(a \mid a \in F) > n(s') \ \ \forall\, s' \subset s \,\}.$

It is finite. At the same time the sequence Gex(F) contains
infinitely many intervals from S(F). Consequently Gex(F) contains
two equal intervals s' = s''. Then the interval between the beginnings
of s' and s'' is the period of Gex(F). It will be clear if we
compare the definition of S(F) by (16) and the definition of Gex(F)
by the greedy algorithm; see § 1.

The author thanks S. Tarassov for the PC program computing ex.


1. Introduction

The use of the interior point method (IPM) for the solution of linear programming (LP)
problems provides a number of benefits which are summarized below. For large or highly
degenerate LPs IPM is usually faster than the simplex solver. Whereas simplex based
algorithms require considerable adaptation and control parameter tuning from one model class
to another, default settings of IPM are sufficient to process a wide class of LPs. IPM is not
only robust in this way; its progress is also not hindered by the degeneracy or the stalling problem
of the simplex method, and indeed it reaches a "near optimal" solution very quickly. Simplex
algorithms, in contrast, are not affected by the boundary conditions which slow down the
convergence of IPM. The fast initial convergence of IPM to a near optimal solution can be
followed up by the superior near optimal to optimal convergence of simplex algorithms to
create a hybrid IPM-simplex system.

In this paper we review the current use of IPM for the solution of integer programming (IP)
problems. We extend our hybrid IPM-simplex system (Levkovitz et al. 92) and investigate
the role of the IPM within a simplex based branch and bound (B&B) algorithm for discrete
programming. An IPM based heuristic is developed whereby the IPM search includes a
non-integrality penalty for the discrete variables. The IPM solution is then used to determine
an alternative tree search criterion for a simplex based B&B algorithm.

The rest of the paper is organized as follows. In section 2 we review the use of IPM for
solving IP problems within the branch and bound algorithm. In section 3 we outline our
proposed heuristic for finding a good starting IP solution. In section 4 we describe the
integration of the IPM solution in the B&B algorithm.

2.  Interior  Point  Method  in  the  B&B  algorithm 

Since the branch and bound (B&B) algorithm creates LP sub-problems that are closely
related to each other, a sub-problem is solved by using an optimal basis of a previously
solved problem. In this way, the solution of a sub-problem requires only a few simplex
iterations. A similarly efficient warm start procedure for IPM, however, is not yet available
(Lustig et al. 90).

The use of IPM in the solution of IP problems was examined by several researchers, notably
Karmarkar et al. (89), Kamath et al. (89, 91), Mitchell and Todd (91), and Mitchell and
Borchers (91). We identify several theoretical advantages of IPM when used in the context
of the B&B algorithm. The key advantage of IPM is the ability to reach a near optimal
solution or to determine in a relatively short time that such a solution does not exist. If a
primal dual algorithm is used, the near optimal solution also provides an upper and a lower
bound for the optimal solution at every iteration. A sub-problem can be abandoned (and
consequently the branch of the tree in B&B) if the problem is found to be infeasible or if the
previously found integer solution is lower than the dual bound. Further, the quick
convergence to the solution means that individual variables reach their near optimal value
quickly and then converge steadily to the optimal one. This allows the use of indicator
functions to distinguish between the dormant variables which converge to their bounds (to
an integer solution) and the active variables which remain fractional (El-barki et al 91). The
smooth convergence of variables to their optimal value means that a convergence to a
fractional value (active variable) can be discovered early in the iterative process. Using this
information we can decide if the problem should be solved to optimality or terminated early.
Another important advantage of primal dual IPM is the generation of the solution points by
following the path of centres (Levkovitz 92). The path of centres can be perturbed in a
way that will attract it to feasible integer lattice points.

Consider the following pure 0-1 integer programming problem:

    min   $c^T x$
    s.t.  $Ax = b$
          $0 \le x \le u$
          $x_j \in \{0,1\}, \ \forall j \in N_I$,                    (2.1)

where $N_I$ is the set of indices of the binary variables.

The linear relaxation of this problem and its corresponding dual problem are:

    Primal:  Min   $c^T x$
             s.t.  $Ax = b$
                   $x + s = u$, $x, s \ge 0$

    Dual:    Max   $b^T y - u^T w$
             s.t.  $A^T y + z - w = c$, $z, w \ge 0$                 (2.2)

where $A \in \mathbb{R}^{m \times n}$, $x, s, w, z, c, u \in \mathbb{R}^n$, $y, b \in \mathbb{R}^m$.


Borchers and Mitchell (91) solve (2.1) by applying a specially designed B&B algorithm
which utilizes the primal dual predictor corrector IPM to solve the generated
sub-problems. The algorithm starts by solving the LP relaxation (2.2). The optimal
solution of this problem provides a lower bound for the best integer solution that can be
found. The algorithm then chooses the first sub-problem by using a 'depth first' search
tree. Other sub-problems are chosen according to their estimated objective function value
such that the sub-problems with lower estimation are solved first.

For  every  sub-problem,  the  following  operations  are  carried  out: 

Algorithm  2.1 

1. Restart the sub-problem from a previously saved IPM solution.

2. Solve the LP sub-problem until the relative duality gap satisfies
   $\dfrac{c^T x - b^T y + u^T w}{|b^T y - u^T w|} < 0.1$.
   If infeasibility is detected in the process then STOP.

3. If $b^T y - u^T w > t$, where t is the best integer solution found so far, then STOP.

4. Declare which variables are fractional by using the following indicators: let k+1
   be the current iteration of the IPM algorithm; then if one of the following
   conditions holds the 0-1 variable is declared fractional:

x**‘  j**'  z**‘ 

|^-ll<0.1.  -^<0.6,  -L^<0.6. 

Xj  Sj  Wj  Zj 

Mitchell and Borchers implemented this algorithm and compared it to an alternative
algorithm that uses cold start IPM and solves every sub-problem to completion. These
two algorithms are also compared to the IP module of IBM/OSL (Forrest and Tomlin 90).
The preliminary results show that although the algorithm presented above is superior to
an algorithm that uses IPM in a simple form, it is still inferior to the OSL/IP algorithm.
Analysis of the results shows that the performance gap is created by a combination of a
longer processing time for sub-problems and a larger number of sub-problems processed
in the IPM algorithm.

3. An IPM based heuristic for finding an integer starting point

In our investigations, we use the primal dual IPM in a heuristic to find a feasible or a
near feasible starting point for pure 0-1 IP problems. This method is designed for
inclusion in a combined IPM/B&B algorithm where the primal dual iteration data are used
as decision criteria in the B&B framework. In our method we try to encourage the binary
variables to converge to their bounds by augmenting the objective function with a penalty
function on the binary variables that remain fractional. In addition, to avoid the stalling
boundary behaviour of IPM, the variables are shifted such that the required optimal
solution is in the interior of the LP feasible region.

By this approach, the movement of a variable towards the binary solution can be detected
and the binary variables can be fixed to their appropriate value earlier. The heuristic
fixes the binary variables one by one while attempting to maintain primal feasibility.
Consider the pure 0-1 IP problem presented in (2.1). We reformulate the problem in the
following way: let $0 \le x_j \le 1$ be the relaxation of the binary variables
$x_j \in \{0,1\}$, $j \in N_I$. These variables are shifted by $\epsilon > 0$ in the following way:

$\forall j \in N_I: \quad \epsilon \le x_j \le 1 + \epsilon.$                    (3.1)


This  requires  the  update  of  the  right  hand  side  values 


j€N, 

otherwise 


(3.2) 


The bounds of the new variables are further relaxed such that

$l_j = 0, \quad u_j = 1 + 2\epsilon, \quad \epsilon > 0.$                    (3.3)

It is clear that a solution to the relaxed problem is feasible for the original problem if the
variables that represent the binary variables are fixed to either $\epsilon$ or $1 + \epsilon$.

To achieve this end, we augment the objective function with a penalty function based on
the sum of fractions:

$\min \ c^T x + M \sum_{j \in N_I} (x_j - \epsilon)(1 + \epsilon - x_j),$                    (3.4)

where big M is a weight such that $M \gg c$.
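
The behaviour of the penalty term can be seen with a few numbers. The sketch below is a minimal illustration, assuming the penalty is summed over the binary index set and using hypothetical names (`nonintegrality_penalty`, `penalised_objective`, `binary_indices`, `big_m`) that do not come from the authors' code.

```python
def nonintegrality_penalty(x, binary_indices, eps, big_m):
    """Penalty M * sum_j (x_j - eps) * (1 + eps - x_j) over the shifted binary variables."""
    return big_m * sum((x[j] - eps) * (1.0 + eps - x[j]) for j in binary_indices)


def penalised_objective(c, x, binary_indices, eps, big_m):
    """Linear objective c^T x plus the non-integrality penalty."""
    linear = sum(cj * xj for cj, xj in zip(c, x))
    return linear + nonintegrality_penalty(x, binary_indices, eps, big_m)


# Three shifted binary variables and one continuous variable, eps = 0.25.
x = [0.25, 1.25, 0.75, 3.0]
print(nonintegrality_penalty(x, [0, 1, 2], eps=0.25, big_m=100.0))
```

Each term vanishes when the variable sits at ε or 1 + ε and is largest, 1/4, at the midpoint 1/2 + ε, so only the third variable contributes here. The term is concave in each variable, so the penalised problem is no longer an LP; how the authors treat it inside the IPM iteration is not specified in this abstract.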


The solution process of the new problem is described in Algorithm 3.1:

Algorithm 3.1: IPM heuristic for finding an IP solution

1. Initialize: Calculate the first feasible point for the LP relaxation problem in (2.2).

2. Execute IPM (predictor corrector).
   If (all the binary variables are fixed or infeasibility is detected) then STOP.

3. Find a variable for fixing, $v \in G$:
   $v = \arg\min_{j \in G} \min\bigl( |x_j - \epsilon|,\ |1 + \epsilon - x_j| \bigr)$.

4. Remove the variable from the problem:
   if $|x_v - \epsilon| \le |1 + \epsilon - x_v|$ then $x_v = \epsilon$ else $x_v = 1 + \epsilon$;
   set $G = G \setminus \{v\}$, $A = A \setminus \{a_v\}$, $b = b - x_v a_v$, $c = c \setminus \{c_v\}$.

5. Reinstate the dual problem.

6. Go to step 2.
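
Steps 3 and 4 can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the matrix is held as a list of rows, `free` plays the role of the set G, and the names `choose_and_fix` and `eliminate` are hypothetical. The IPM iterations of step 2 and the dual reinstatement of step 5 are not shown.

```python
def choose_and_fix(x, free, eps):
    """Step 3 and the first half of step 4: pick the free binary variable
    closest to a shifted bound and return (index, value it is fixed to)."""
    def distance(j):
        return min(abs(x[j] - eps), abs(1.0 + eps - x[j]))

    v = min(sorted(free), key=distance)  # sorted() makes tie-breaking deterministic
    value = eps if abs(x[v] - eps) <= abs(1.0 + eps - x[v]) else 1.0 + eps
    return v, value


def eliminate(a_rows, b, c, v, value):
    """Second half of step 4: drop column v from A and c and move the fixed
    variable's contribution to the right hand side, b := b - value * a_v."""
    new_b = [bi - value * row[v] for row, bi in zip(a_rows, b)]
    new_a = [row[:v] + row[v + 1:] for row in a_rows]
    new_c = c[:v] + c[v + 1:]
    return new_a, new_b, new_c


x = [0.30, 1.15, 0.70]
v, value = choose_and_fix(x, {0, 1, 2}, eps=0.25)
print(v, value)
```

Here variable 0 is selected because it lies within 0.05 of the shifted lower bound, and it is fixed to ε = 0.25. After `eliminate` the remaining columns are renumbered, so a real implementation would keep a map back to the original indices.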

The procedure described in Algorithm 3.1 continues until all the integer variables are
fixed or until it is established that the heuristic does not lead to an integer feasible solution.
In both cases the last feasible solution is converted to simplex using the basis retrieval
procedure. Preliminary results of our heuristic on a small set of problems with binary
variables are presented in Table 3.1. The first four columns give the model name,
number of rows, number of columns and number of binary variables respectively. The
columns marked 'LP sol' and 'IP sol' give the optimal LP solution and the best integer
solution respectively. The column marked IPMH presents the result of running the IPM
heuristic to termination. The last column in the table indicates the sum of infeasibilities
where 10* is the feasibility tolerance. These results, although preliminary, indicate that
the use of IPM as an alternative heuristic to generate a first feasible integer solution in the
branch and bound procedure offers some advantages.


Table 3.1 Result of IPM integer heuristic

4. Using IPM within B&B to Construct a Good Starting Solution

The integration of IPM and simplex has become a leading applied research topic in the area
of large scale LPs, see for example (Forrest and Tomlin 90) and (Levkovitz 92). The
integration at the level of IP tree search with IPM, however, has not been considered or
reported so far. In section 2, we noted that the results reported by Borchers and Mitchell (91)
are not comparable to those of OSL, mainly because IPM cannot be efficiently warm-started
to re-optimise a series of sub-problems. Taking advantage of our basis recovery procedure,
mainly from IPM to simplex, and also making use of a fixing agenda (fix-mix) given by the
discrete solution obtained by the IPM heuristic, we have created the following procedure for
constructing a partially specified search tree (Hajian 92).

The choice of a variable which is partitioned to discrete values for creating the corresponding
sub-problems and the choice of a sub-problem taken together determine the structure and size
of a search tree. We added to our existing B&B solver an algorithmic facility by which a
part of the search tree is created rapidly from an initial agenda of the global entities. Such
an agenda may be obtained from various sources such as: (i) an existing schedule in the case of
scheduling, (ii) applying the IPM heuristic described in section 3, (iii) rounding off a quasi
integer solution, or by any other means.

A given agenda does not necessarily provide an integer feasible solution but it has to be a
near feasible integer solution in order to provide any benefits. A B&B code which is
designed to take advantage of a partial or a full agenda of discrete variables may perform
a more efficient tree search than one without such a facility. It is obvious that using a
given integer feasible solution in B&B saves the time spent on the heuristic for obtaining
the first integer solution. We call this procedure fix-mix; it is equivalent to warm-starting
the B&B algorithm. The fix-mix procedure is described in Algorithm 4.1. In this algorithm
$f_{IP}$ is the value of a possible integer feasible solution (if one is gained) and $f'_{IP}$ is the given integer
feasible (or near feasible) solution in the fix-mix agenda.

Algorithm 4.1 Fix-Mix (Hajian 92)

0. Initialize: Solve the first LP relaxation problem. If the solution is integer feasible then
   STOP, else set the lower bound on the objective function to $f_{IP}$ (if one exists).

1. Fix: If the fix-mix agenda is empty then go to step 4, else choose a variable from
   the fix-mix agenda and fix it to the appropriate value.

2. Store: Create a subproblem by fixing the variable in the previous step to its
   complementary value and store the subproblem together with the current basis
   in the stack.

3. Solve: Solve the current subproblem. If (the solution is integer feasible and the
   objective function value $f_{IP} > f'_{IP}$) or (the solution is not integer) then
   go to step 1.

4. Execute the standard B&B algorithm.

The flow chart of Algorithm 4.1 is illustrated in Figure 4.1.

Figure 4.1 The B&B algorithm with Fix-Mix


1 Introduction

In this paper we consider the problem of maximizing a linear function over a convex, compact set,
defined by convex functions. A logarithmic barrier cutting plane algorithm is developed for solving
such problems. As in other cutting plane methods, a linear relaxation of the convex programming
problem is generated in each stage of the algorithm. Instead of solving these relaxations, we just try
to follow their central path with a long-step logarithmic barrier method. If the next iterate would
be outside the feasible region or close to its boundary, then we do not make the step; instead we add a
new constraint at the point (or "close" to it) where the step violates a constraint. In this way
we combine Interior Point Methods (IPM's) with a new cutting plane (decomposition) method.
Since a convex programming problem is in fact a semi-infinite programming problem,
our algorithm can be regarded as an IPM for semi-infinite programming problems.

The first, and still probably the most popular, cutting-plane algorithm for convex programs was
developed by Kelley [11]. Here an LP relaxation is solved and an infeasible point is generated in each
step. Kelley's method does not work well in practice. The so-called "central cutting plane methods"
of Elzinga and Moore [2] and Goffin and Vial [4] are considered quite efficient (see e.g. Kortanek
and Ho [14], Goffin, Haurie and Vial [3] and Bahn, Goffin, Vial and Du Merle [1]). They calculate
a certain "center" of the LP relaxation. If the center is feasible, then they add an objective cut; if
the center is infeasible, then a new separating hyperplane is generated. Therefore these methods
generate feasible and infeasible points during the algorithm, although this might not be the case.
Contrary to these methods, our algorithm remains in the interior of the feasible set.

The third area to be considered in this paper is the theory of IPM's. Karmarkar [12] initiated
the explosively developing field of IPM's. These methods not only have theoretical interest, but
are also practically efficient, especially for large and degenerate problems. Jarre [10], Nesterov and
Nemirovski [15] and Den Hertog, Roos and Terlaky [7, 8] generalized logarithmic barrier methods
to smooth convex programming.

In [6] a build-up strategy for the long-step logarithmic barrier method for LP is presented. In [9]
the effect of adding and deleting constraints in the logarithmic barrier method for LP is studied.
These algorithms start with a (small) subset of the constraints, and follow the corresponding
central path until the iterate is close to (or violates) one of the other constraints. To be
more precise, if some slack value $s_i$ satisfies $s_i < 2^{-t}$, for some 'proximity' parameter t, then the
i-th constraint, if not already in the current system, is added to the current system. This process
is repeated until the iterate is close to the optimum.

This work is completed under the support of a research grant of SHELL.

As far as notations are concerned, e shall denote the vector of all ones, $e_i$ the i-th unit vector and
I the identity matrix. Given an m × n matrix A, its columns are denoted by $a_i$, $i = 1, \ldots, n$. Given
an n-dimensional vector s we denote by $s^T$ the transpose of the vector s and the same notation
holds for matrices. Finally $\|s\|$ denotes the $\ell_2$ norm.

We will consider the following convex programming problem.

(CP)  max   $b^T y$
      s.t.  $y \in \mathcal{F}$,

where $y, b \in \mathbb{R}^m$ and

$\mathcal{F} := \{ y \in \mathbb{R}^m : f_i(y) \le 0,\ 1 \le i \le n \}.$

The functions $f_i(y)$, $1 \le i \le n$, are assumed to be convex. Without loss of generality we may
assume that the objective function is linear and we also assume that $\|b\| = 1$ and $\mathcal{F}$ is compact.
Further we assume that the interior of $\mathcal{F}$ is not empty. This condition is equivalent to the
Slater condition used by Elzinga and Moore [2].


2  Central  Cutting  Plane  Algorithms 

The  essence  of  central  cutting  plane  algorithms  can  be  given  as  follows. 


Central Cutting Plane Algorithm

Input:
  $\mathcal{P}$ is a convex polytope such that $\mathcal{F} \subset \mathcal{P}$;
  z is a lower bound for the objective value on $\mathcal{F}$;
  $\epsilon$ is any small number, the stopping tolerance;
  r is initially a large number (the distance of the center to the boundary).

begin
  while $\epsilon < r$ do
    Compute (approximately) the "center" $y^c$ of the polytope $\mathcal{P} \cap \{y : b^T y \ge z\}$;
    $r := b^T y^c - z$;
    if $y^c \notin \mathcal{F}$ then Feasibility Cut
    else Objective Cut
    endif
  end
end.

Objective Cut

begin
  $z := b^T y^c$; the new objective cut will be $b^T y \ge z$.
end.

Feasibility Cut

begin
  $f_i(y^c) > 0$, i.e. $f_i$ is a violated constraint. Add the cut
    $f_i(y^c) + \nabla f_i(y^c)^T (y - y^c) \le 0$
  to the current system: $\mathcal{P} := \mathcal{P} \cap \{ y : f_i(y^c) + \nabla f_i(y^c)^T (y - y^c) \le 0 \}$.
end.
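
The feasibility cut is the linearization of a violated convex constraint at the current center. The sketch below shows how such a cut can be built and checked on a small example; the constraint, the point and the names `linearization_cut` and `satisfies` are assumptions chosen for illustration.

```python
def linearization_cut(f, grad_f, y_c):
    """Return (a, beta) for the cut a . y <= beta obtained from
    f(y_c) + grad f(y_c) . (y - y_c) <= 0."""
    a = grad_f(y_c)
    beta = sum(ai * yi for ai, yi in zip(a, y_c)) - f(y_c)
    return a, beta


def satisfies(cut, y):
    """True if the point y satisfies the cut a . y <= beta."""
    a, beta = cut
    return sum(ai * yi for ai, yi in zip(a, y)) <= beta


# Hypothetical example: the unit disc f(y) = y1^2 + y2^2 - 1 <= 0
# and an infeasible center y_c = (2, 0).
f = lambda y: y[0] ** 2 + y[1] ** 2 - 1.0
grad_f = lambda y: (2.0 * y[0], 2.0 * y[1])

cut = linearization_cut(f, grad_f, (2.0, 0.0))
print(cut)                         # the cut reads 4 * y1 <= 5
print(satisfies(cut, (2.0, 0.0)))  # the infeasible center is cut off
print(satisfies(cut, (1.0, 0.0)))  # a feasible point is kept
```

Because f is convex, its linearization never exceeds f, so every point with f(y) ≤ 0 satisfies the cut while the center y_c, where f(y_c) > 0, violates it.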

Modifying Elzinga and Moore's [2] proof of convergence, we can give a convergence proof for Goffin
and Vial's [4] central cutting plane method. We will assume that an algorithm is available to find
the appropriate center in these algorithms. Up to now in the implementations of the algorithm of
Elzinga and Moore the simplex method was used to find the center [14] (using the simplex method
is not essential, IPM's can be used as well), while Goffin and Vial used the projective method [1].

The convergence proof is based on the following observation. Sonnevend [17] proved that for any
polyhedron $\mathcal{P} \subset \mathbb{R}^m$ there exist two ellipsoids E and E' with E' = mE such that
$y^a + E \subset \mathcal{P} \subset y^a + E'$, where $y^a$ is the analytic center of the polytope. Now let us inscribe a sphere into $\mathcal{P}$. It is
obvious that the radius of the largest inscribed sphere is larger than or equal to the smallest axis of
E and not greater than the smallest axis of E'.

Elzinga and Moore [2] calculate the "ball center" of the actual polytope. This is defined as the
center of the largest inscribed sphere. Formally, for a polyhedron $\{y : A^T y \le c\}$, the ball center is
the solution of the following LP problem at iteration k:

$\sigma^k = \max\{\sigma : a_i^T y + \|a_i\| \sigma \le c_i,\ i = 1, \ldots, n\}.$

Goffin and Vial [4] use the so-called "analytic center" [17] of the polytope. The analytic center is
the unique maximizer of the logarithmic barrier function:

$\max\{\textstyle\sum_{i=1}^n \log(c_i - a_i^T y) : A^T y \le c\}.$
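
As an illustration of the analytic center, the sketch below maximizes the logarithmic barrier function for a polytope in two variables with a damped Newton iteration. It is a minimal sketch under stated assumptions and is not the projective method used by Goffin and Vial; the function name, the constraint format and the stopping constants are hypothetical.

```python
import math


def analytic_center_2d(constraints, y, iterations=50, tol=1e-18):
    """Damped Newton iteration for max sum_i log(c_i - a_i . y) in two variables.

    constraints: list of ((a1, a2), c), each meaning a1*y1 + a2*y2 <= c
    y:           strictly feasible starting point (y1, y2)
    """
    def slacks(p):
        return [c - (a[0] * p[0] + a[1] * p[1]) for a, c in constraints]

    def barrier(p):
        return sum(math.log(s) for s in slacks(p))

    for _ in range(iterations):
        s = slacks(y)
        rows = [(a, si) for (a, _), si in zip(constraints, s)]
        # Gradient and Hessian of the negative barrier (a convex function).
        g1 = sum(a[0] / si for a, si in rows)
        g2 = sum(a[1] / si for a, si in rows)
        h11 = sum(a[0] * a[0] / si ** 2 for a, si in rows)
        h12 = sum(a[0] * a[1] / si ** 2 for a, si in rows)
        h22 = sum(a[1] * a[1] / si ** 2 for a, si in rows)
        det = h11 * h22 - h12 * h12
        # Newton direction d = -H^{-1} g, written out for the 2 x 2 case.
        d1 = -(h22 * g1 - h12 * g2) / det
        d2 = -(-h12 * g1 + h11 * g2) / det
        if -(g1 * d1 + g2 * d2) < tol:  # squared Newton decrement
            break
        # Halve the step until the trial point is interior and improves the barrier.
        step, trial = 1.0, y
        while step > 1e-12:
            candidate = (y[0] + step * d1, y[1] + step * d2)
            if min(slacks(candidate)) > 0 and barrier(candidate) > barrier(y):
                trial = candidate
                break
            step *= 0.5
        if trial == y:
            break
        y = trial
    return y


# Triangle y1 >= 0, y2 >= 0, y1 + y2 <= 3.
triangle = [((-1.0, 0.0), 0.0), ((0.0, -1.0), 0.0), ((1.0, 1.0), 3.0)]
print(analytic_center_2d(triangle, (0.5, 0.5)))
```

For the triangle the barrier is log y1 + log y2 + log(3 - y1 - y2), whose maximizer is (1, 1), and the iteration converges to that point. The polytope must be bounded with a nonempty interior, otherwise the Hessian may be singular.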

Let $\rho^k$ be the smallest radius of the largest inscribed ellipsoid, centered at the analytic center at
iteration k. (It is known [17] that there always exists an inscribed ellipsoid centered at the analytic
center, and $\rho^k \le \sigma^k \le m\rho^k$ for a given polyhedron.) The following can be proved.

Theorem 1 (i) For the sequence of the generated centers $\{y^k\}_{k=1}^\infty$ the sequence $\rho^k$ converges to
zero.

(ii) If $y \in \operatorname{int} \mathcal{F}$ then there exists a $\rho > 0$ such that y is $\rho$ feasible for any constraint cut.

(iii) If $\lim_{k\to\infty} \rho^k = 0$ and $z < z^*$ then there is a $\bar{k}$ such that $y^k \in \mathcal{F}$ for all $k > \bar{k}$.

(iv) If $z < z^*$ then every convergent subsequence converges to an optimal solution.


3  A  Logarithmic  Barrier  Cutting  Plane  Method  for 
Convex  Programming 

We  remind  some  well-known  results  for  path-following  methods  for  LP  [16].  We  also  use  similar 
ideas  then  that  are  used  in  the  Build-Up  and  Down  algorithm  of  [9].  The  algorithm  and  its  analysis 
is  based  on  the  results  about  the  effect  of  adding  and  deleting  constraints  in  logarithmic  barrier 
methods,  studied  in  [6,  9]. 

Now we present our cutting plane algorithm for convex programming problems, which is a straightforward application of the Build-Up and Down Algorithm of [6]. We solve a sequence of LP relaxations of (CP); the LP problem under consideration is always the current relaxation (localization) of (CP). The algorithm starts from an interior feasible solution. Then, as in path-following methods, line searches are performed along the Newton direction of the logarithmic barrier function of the current LP problem. If the new iterate violates a constraint or is “close” to its boundary, then we stay at the previous point and a new supporting hyperplane is added to improve the approximation. If the current iterate is centered, we try to eliminate redundant constraints with a careful deletion rule, and so keep the size of the LP relaxation as small as possible.

We assume that box constraints are included in the problem and that the index set $J$ refers to the box constraints. The algorithm goes as follows.

A Logarithmic Barrier Cutting Plane Algorithm

Input:

$\mathcal{P}$ is a convex polytope (the initial LP approximation) such that $\mathcal{F} \subseteq \mathcal{P}$;
$\mu := \mu_0$ is a barrier parameter value;
$\kappa$ is a convergence parameter;
$\theta$ is a reduction parameter, $0 < \theta < 1$;
$Q$ is an initial index set of the (linear) constraints of $\mathcal{P}$;
$y \in \mathcal{F}$ is a given interior feasible point such that $\delta_Q(y,\mu) \le \tfrac14$.

begin
   while $\mu > 2^{-\kappa}$ do
   begin
      Delete-Constraints;
      $\mu := (1-\theta)\mu$;
      Center-and-Add-Cut
   end
end.

Procedure Delete-Constraints

Input:

$t_d > 4$ is a 'deleting' parameter;

begin
   for $i := 1$ to $n$ do
      if $i \in Q \setminus J$ and $s_i > t_d$ then
      begin
         $Q := Q \setminus \{i\}$;
         if $\|p_Q\| > \tfrac14$ then Center-and-Add-Cut;
      end
end.

Procedure Center-and-Add-Cut

Input:

$t_a$ is an 'adding' parameter;

begin
   while $\delta_Q(y,\mu) > \tfrac14$ do
   begin
      $\bar{y} := y$;
      $\tilde{\alpha} := \arg\max_{\alpha>0}\{\phi_Q(y+\alpha p_Q,\mu) \;:\; s_i - \alpha a_i^T p_Q > 0,\ \forall i \in Q\}$;
      $y := y + \tilde{\alpha}\,p_Q$;
      if $\exists k : -f_k(y) < 2^{-t_a}$ then Add-Cut
   end
   $\bar{y} := y$
end.


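Procedure Add-Cut

Input:

$y \in \mathcal{P}$ and $y \notin \mathcal{F}$, or $y$ is “close” to the boundary of $\mathcal{F}$;

begin
   If $y \notin \mathcal{F}$ then $y^b$ is the boundary point of $\mathcal{F}$ on the line segment $(\bar{y}, y)$;
   If $y \in \mathcal{F}$ then $y^b$ is the boundary point of $\mathcal{F}$ on the halfline $\bar{y} + \vartheta(y-\bar{y})$, $\vartheta > 0$;
   $f_k$ is a constraint with $f_k(y^b) = 0$;
   $a := \nabla f_k(y^b)/\|\nabla f_k(y^b)\|$, $\gamma := \nabla f_k(y^b)^T y^b/\|\nabla f_k(y^b)\|$;
   $\mathcal{P} := \mathcal{P} \cap \{y : a^T y \le \gamma\}$
end.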
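The cut construction in Add-Cut can be illustrated for the case where the new iterate is infeasible. The sketch below is an assumption-laden illustration, not the authors' code: it handles a single convex constraint f(y) <= 0, locates the boundary point on the segment by bisection, and normalizes the gradient there to obtain the supporting hyperplane. The helper names `boundary_point` and `supporting_cut` and the unit-disc example are hypothetical.

```python
import math

def boundary_point(f, y_in, y_out, tol=1e-12):
    """Bisection on the segment from y_in to y_out for a point with f = 0.

    Assumes f is convex (hence continuous) with f(y_in) < 0 <= f(y_out).
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        p = [a + mid * (b - a) for a, b in zip(y_in, y_out)]
        if f(p) < 0.0:
            lo = mid      # still strictly inside: move towards y_out
        else:
            hi = mid      # on or outside the boundary: move towards y_in
    t = 0.5 * (lo + hi)
    return [a + t * (b - a) for a, b in zip(y_in, y_out)]

def supporting_cut(grad_f, y_b):
    """Return (a, gamma) describing the normalized cut a^T y <= gamma at y_b."""
    g = grad_f(y_b)
    norm = math.sqrt(sum(gi * gi for gi in g))
    a = [gi / norm for gi in g]
    gamma = sum(ai * yi for ai, yi in zip(a, y_b))
    return a, gamma

# Example: the unit disc f(y) = y1^2 + y2^2 - 1 <= 0, cut seen from (2, 0).
disc = lambda y: y[0] ** 2 + y[1] ** 2 - 1.0
disc_grad = lambda y: [2.0 * y[0], 2.0 * y[1]]
y_b = boundary_point(disc, [0.0, 0.0], [2.0, 0.0])
a, gamma = supporting_cut(disc_grad, y_b)
```

Because f is convex, every feasible point satisfies the generated inequality, so the cut removes the infeasible iterate without cutting into the feasible set. The halfline case (iterate feasible but close to the boundary) would use the same gradient step after a different search for the boundary point.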


About  Convergence 

So far we have not been able to obtain a rigorous proof of convergence. The key problem is how to prove that an inner cycle is finite. Namely, if we are centered, then the barrier parameter is reduced. The problem is to prove that after a finite number of iterations we will be centered again. The results of Den Hertog, Roos and Terlaky [8, 9] can probably be used to prove this.

A  straightforward  convergence  proof  for  a  variant  of  the  above  algorithm  can  be  obtained  by  using 
the  results  about  the  discretization  of  semi-infinite  programming  problems  (see  e.g.  Gustafson  [5]). 

Comparison with Other Central Cutting Plane Methods

Kelley's [11] cutting plane method generates infeasible points and is known to be unstable and inefficient in practice. Elzinga and Moore's [2] central cutting plane method eliminates these disadvantages. Centering ensures some stability, and at least some of the iterates are feasible; therefore the algorithm may be stopped early with some useful information in hand. Kortanek and No [14] report some encouraging computational results. They also used the simplex method to solve the LP relaxations. The central cutting plane method of Goffin and Vial [4] has the advantages of Elzinga and Moore's method. They calculate the analytic center of the polytope with the projective algorithm. The analytic center has some nice properties, which suggest that this method might be even more efficient than the cutting plane method of Elzinga and Moore. Good computational results are reported in [1].

Our method shares the advantages of the above central cutting plane methods. We start from a feasible point, and feasibility is preserved while we try to follow the central path of the current LP relaxation. This provides the centering component in our approach and also provides stability as above. When new cuts are added, the current center moves in the opposite direction, but the change cannot be arbitrarily large (see [8, 9]). The same holds if a loose cut is deleted. The main difference from the other approaches is that in the other central cutting plane methods the LP relaxation is fixed while the new center is calculated. We dynamically refine the LP approximation of (CP) during the iterations as we try to get close to the central path.

4  Computational  Experience 

The implementation of the Logarithmic Barrier Cutting Plane Algorithm was developed to run both in a PC environment (using Microsoft FORTRAN) and on a mainframe platform. The results presented below were computed using FORTRAN on an IBM 3090-200e with vector processing. Similar results (in terms of iteration counts and solution accuracies) are also obtained on a workstation and with the PC-based system.

About  the  Test  Problems 

We solved 19 test problems. The first 14 problems, taken from [14, 1], are examples of geometric programming problems. The exponential variable transformation ($t_i = e^{x_i}$) is used for each of the problems, thereby eliminating the need to maintain the positivity constraint $t > 0$ explicitly. Also, the objective function of each of these problems has been made linear. The final 5 problems are examples from [13]. Problems 15, 16, and 18 are geometric programs; problem 17 is a semi-infinite program. The majority of the problems which we investigated were small enough to allow an initial feasible point to be found by inspection. Problems 10, 11, 12, and 13 required the use of phase 1.
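The effect of the exponential transformation can be seen on a small posynomial. The sketch below is illustrative only: the coefficients, exponents and function names are hypothetical and are not one of the 19 test problems. It evaluates a posynomial in the original variables t and in the transformed variables x with t_i = exp(x_i), where positivity of t holds automatically.

```python
import math

def posynomial(coeffs, exponents, t):
    """sum_k c_k * prod_i t_i^{a_ki}, defined for t > 0."""
    return sum(c * math.prod(ti ** a for ti, a in zip(t, row))
               for c, row in zip(coeffs, exponents))

def transformed(coeffs, exponents, x):
    """The same function after t_i = exp(x_i): sum_k c_k * exp(a_k . x)."""
    return sum(c * math.exp(sum(a * xi for a, xi in zip(row, x)))
               for c, row in zip(coeffs, exponents))

# Hypothetical data: 3 t1^2 / t2 + 0.5 t2 / sqrt(t1)
C = [3.0, 0.5]
E = [[2.0, -1.0], [-0.5, 1.0]]

t = (2.0, 0.5)
x = [math.log(ti) for ti in t]
value_t = posynomial(C, E, t)
value_x = transformed(C, E, x)

# Midpoint check of convexity in x between two arbitrary points.
x1, x2 = [-1.0, 2.0], [0.7, -0.3]
mid = [0.5 * (u + v) for u, v in zip(x1, x2)]
mid_value = transformed(C, E, mid)
chord_value = 0.5 * (transformed(C, E, x1) + transformed(C, E, x2))
```

In the x variables each term is the exponential of a linear function, so the transformed function is convex on all of the space and no explicit positivity constraint is needed.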

About  the  Implementation 

In this subsection we discuss several of the important implementation techniques we used in developing our path-following cutting plane system for solving CP. This discussion centers on the main activities performed by the system:

• generating search directions;

•  linesearching  and  generating  cuts; 

•  determining  an  initial  interior  point; 

•  setting  the  required  parameters; 

•  terminating  the  algorithm. 

Computational  Results 

Although  the  results  presented  here  originate  from  our  system  running  on  the  IBM  3090-200e, 
similar  results  may  be  found  with  the  system  on  the  PC  and  workstation  environment. 

There are several points which we wish to highlight from Table 1. First, there is a remarkable consistency in the major iteration counts listed. This consistency is a direct result of the strategy we use to reduce the log barrier parameter $\mu$. A truer measure of the work required to solve a problem is the number of normal matrix formulations and factorizations required to obtain a given level of solution accuracy. This measure indicates a wide variance in the difficulty the algorithm had in solving the different problems.

Also in Table 1 we see that duality gaps of $10^{-9}$ to $10^{-12}$ can typically be achieved with this method. Although machine accuracy is not obtained, the method presented here consistently and significantly outperforms the method of [14], and our results are also better than those presented in [1] in terms of solution accuracy. Note that for problem 13 our method had difficulties obtaining the usual level of precision; a gap of only 9.4e-7 was possible. In this problem there was an enormous range in the coefficients, which exacerbated the numerical difficulties.
Table 1: Problem Results Overview

Prob.  Rows  Col's  Max. Q  Iters.  Number of       Average   Duality   Total CPU
No.                 Size            Matrix Factor.  Recenter  Gap       Sec.
  1      2     1      15      13         35            2.7    1.2e-12     0.04
  2      4     2      25      13         51            3.9    1.8e-12     0.06
  3      3     1      20      14         79            5.6    1.3e-11     0.10
  4      4     1      30      12        102            8.5    1.7e-10     0.17
  5     11     3      60      12        120           10.0    5.7e-9      0.35
  6      4     3      25      12         59            4.9    1.4e-9      0.07
  7      8     ?      45      15         87            5.8    4.0e-12     0.16
  8      8     8      45      15        113            7.5    3.8e-12     0.21
  9      7     ?      56      14        190           13.6    4.7e-11     0.44
 10      7     ?      53      17        155            9.1    2.8e-9      0.33
 11      7     4      45      18        153            8.5    4.5e-10     0.32
 12      7     4      53      20        162            8.1    4.9e-11     0.36
 13     10     7      94      15        242           16.1    9.4e-7      0.48
 14     22    36     161      13        304           23.4    1.1e-9      1.78
 15      2     2       ?       ?         28            2.2    7.2e-12     0.02
 16      2     1       ?       ?         18            1.4    1.7e-12     0.02
 17      2     ∞       ?      13         31            2.4    1.0e-12     0.06
 18      3     ?      15      13         34            2.6    6.1e-12     0.03
 19      3     ?      15      14         44            3.1    2.2e-12     0.05
1 Introduction

The efficiency of a barrier method for solving convex programs strongly depends on the properties of the barrier function used. A key property that is sufficient to prove fast convergence for barrier methods is the property of self-concordance introduced in [17]. This condition not only allows a proof of polynomial convergence, but numerical experiments in [1, 11, 14] and others further indicate that numerical algorithms based on self-concordant barrier functions are of practical interest and effectively exploit the structure of the underlying problem.

A well-known barrier function for solving convex programs is the logarithmic barrier function, introduced by Frisch [4] and Fiacco and McCormick [3]. To describe the logarithmic barrier function more precisely, we will first give a general form for the classes of problems considered in this paper:

$$(\mathcal{CP})\qquad \begin{array}{l}\min\ f_0(x)\\ f_i(x)\le 0,\quad i=1,\dots,m,\\ Ax=b,\end{array}$$

where $A$ is a $p \times n$ matrix and $b$ a $p$-dimensional vector. The logarithmic barrier function for this program is given by

$$\varphi(x,\mu)=\frac{f_0(x)}{\mu}-\sum_{i=1}^{m}\ln(-f_i(x)),$$

where $\mu > 0$ is the barrier parameter. We show that the logarithmic barrier function is self-concordant for several classes of convex problems for which interior-point methods have been presented in the literature. These classes are: dual geometric programming, (extended) entropy programming, and primal and dual $l_p$-programming. Since no complexity results are known in the literature for dual geometric programming and dual $l_p$-programming, these self-concordance proofs enlarge the class of problems for which polynomiality can be proved. (In [12] only a convergence analysis is given.) Moreover, we show that some other smoothness conditions used in the literature (the relative Lipschitz condition [9, 7], the scaled Lipschitz condition [25, 13], and Monteiro and Adler's condition [16]) are also covered by this self-concordance condition. These observations allow a unification of the analyses of interior-point methods for a number of convex problems.

¹The fourth author is on leave from the Eötvös University, Budapest. Research partially supported by OTKA No. 2116.

The article is divided into three parts. In Section 2 we give the definition of self-concordance and state some basic lemmas about self-concordant functions. In Sections 3 - 6 we prove self-concordance for the classes of problems treated in [5, 12, 23], and in Section 7 we show that the smoothness conditions used in [7, 9, 13, 16, 25] imply self-concordance of the barrier function.


2  Some  general  composition  rules 

Let us first give the precise definition of self-concordance as given by Nesterov and Nemirovsky [17]:

Definition of self-concordance: Let $\mathcal{F}^0$ be an open convex subset of $\mathbb{R}^n$. A function $\varphi : \mathcal{F}^0 \to \mathbb{R}$ is called $\kappa$-self-concordant on $\mathcal{F}^0$, $\kappa \ge 0$, if $\varphi$ is three times continuously differentiable in $\mathcal{F}^0$ and if for all $x \in \mathcal{F}^0$ and $h \in \mathbb{R}^n$ the following inequality holds:

$$|\nabla^3\varphi(x)[h,h,h]| \le 2\kappa\,\big(h^T\nabla^2\varphi(x)h\big)^{3/2},$$

where $\nabla^3\varphi(x)[h,h,h]$ denotes the third differential of $\varphi$ at $x$ and $h$.
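A one-dimensional numerical check can make the definition concrete. The sketch below assumes the standard form of the inequality, |phi'''(x) h^3| <= 2 kappa (phi''(x) h^2)^(3/2), and evaluates the ratio of the two sides for two hand-differentiated functions; the helper name `concordance_ratio` and the sample grid are illustrative choices.

```python
def concordance_ratio(d2, d3, x, h=1.0):
    """Return |phi'''(x) h^3| / (2 * (phi''(x) h^2)^(3/2)) for a 1-D function.

    phi is kappa-self-concordant when this ratio never exceeds kappa.
    """
    return abs(d3(x) * h ** 3) / (2.0 * (d2(x) * h * h) ** 1.5)

# phi(x) = -ln x:  phi'' = 1/x^2,  phi''' = -2/x^3
log_d2 = lambda x: 1.0 / x ** 2
log_d3 = lambda x: -2.0 / x ** 3

# phi(x) = x ln x - ln x:  phi'' = 1/x + 1/x^2,  phi''' = -1/x^2 - 2/x^3
ent_d2 = lambda x: 1.0 / x + 1.0 / x ** 2
ent_d3 = lambda x: -1.0 / x ** 2 - 2.0 / x ** 3

grid = [10.0 ** k for k in range(-4, 5)]
log_ratios = [concordance_ratio(log_d2, log_d3, x) for x in grid]
ent_ratios = [concordance_ratio(ent_d2, ent_d3, x) for x in grid]
```

For the plain logarithm the ratio is identically 1, so -ln x is 1-self-concordant with equality everywhere; for the entropy term combined with the logarithm the ratio equals (x + 2) / (2 (x + 1)^(3/2)), which stays below 1 on the positive axis.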

Intuitively, since $\nabla^3\varphi$ describes the change in $\nabla^2\varphi$, and since $|\nabla^3\varphi|$ is bounded by a suitable power of $\nabla^2\varphi$, this condition implies that the relative change of $\nabla^2\varphi$ is bounded by $2\kappa$. The associated norm to measure the relative change is given by $\nabla^2\varphi$ itself, i.e. for $h \in \mathbb{R}^n$ the norm associated with the point $x$ is $\|h\|_{\nabla^2\varphi(x)} := (h^T\nabla^2\varphi(x)h)^{1/2}$. (See [10] and [6] for example, where a brief analysis is also given, showing that self-concordance of the barrier function of a convex program is sufficient to prove polynomial convergence. A more detailed analysis that includes certain nonconvex programs and that uses an additional condition relating the first and second derivatives of $\varphi$ is given in [17].)

The  following  lemma  gives  some  helpful  composition  rules  for  self-concordant  functions.  It 
follows  immediately  from  the  definition  of  self-concordance. 

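Lemma 1 (Nesterov and Nemirovsky [17])

• (addition and scaling) Let $\varphi_i$ be $\kappa_i$-self-concordant on $\mathcal{F}_i^0$, $i = 1,2$, and let $\rho_1,\rho_2 \in \mathbb{R}_+$; then $\rho_1\varphi_1 + \rho_2\varphi_2$ is $\kappa$-self-concordant on $\mathcal{F}_1^0 \cap \mathcal{F}_2^0$, where $\kappa = \max\{\kappa_1/\sqrt{\rho_1},\ \kappa_2/\sqrt{\rho_2}\}$.

• (affine invariance) Let $\varphi$ be $\kappa$-self-concordant on $\mathcal{F}^0$ and let $B(x) = Bx + b : \mathbb{R}^k \to \mathbb{R}^n$ be an affine mapping such that $B(\mathbb{R}^k) \cap \mathcal{F}^0 \ne \emptyset$. Then $\varphi(B(\cdot))$ is $\kappa$-self-concordant on $\{x : B(x) \in \mathcal{F}^0\}$.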

The next lemma gives a sufficient condition for an objective function $f$ to guarantee that $f$ “combined” with the logarithmic barrier function for the positive orthant of $\mathbb{R}^n$ is self-concordant.


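Lemma 2 Let $f(x) \in C^3(\mathcal{F}^0)$ be convex. If there exists a $\beta$ such that

$$|\nabla^3 f(x)[h,h,h]| \le \beta\, h^T\nabla^2 f(x)h\,\sqrt{\sum_{i=1}^{n}\frac{h_i^2}{x_i^2}} \qquad (1)$$

$\forall x \in \mathcal{F}^0$ and $\forall h \in \mathbb{R}^n$, then

$$\varphi(x) := f(x) - \sum_{i=1}^{n}\ln x_i$$

is $(1 + \tfrac13\beta)$-self-concordant on $\mathcal{F}^0$, and

$$\psi(y,x) := -\ln(y - f(x)) - \sum_{i=1}^{n}\ln x_i$$

is $(1 + \tfrac13\beta)$-self-concordant on $\tilde{\mathcal{F}}^0$. Here, $\tilde{\mathcal{F}}^0 \subseteq \mathbb{R}\times\mathcal{F}^0$ is the set $\{(y,x) \mid x \in \mathcal{F}^0,\ y > f(x)\}$.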

Some discussion of property (1) may be useful. Consider the logarithmic barrier $-\sum_{i=1}^{n}\ln x_i$ for $\mathbb{R}^n_+$. Observe that its Hessian $H(x)$ satisfies $h^T H(x)h = \sum_{i=1}^{n} h_i^2/x_i^2$. We recall that (as mentioned above) the canonical norm associated with a barrier function at a point $x$ is given by the Hessian of the barrier at $x$. Loosely speaking, property (1) tells us that for $\|h\|_{H(x)} = 1$, the spectral norm of the third derivative $\nabla^3 f$ is bounded by a multiple $\beta$ of the spectral norm of the second derivative $\nabla^2 f$. This property is defined in [17] as $f$ being compatible with the barrier, and, as we have seen, it implies self-concordance of the combined barrier functions $\varphi$ and $\psi$.

Clearly, if $f$ satisfies (1), then so does $f/\mu$ for any (fixed) parameter $\mu > 0$. In particular, this implies that the function $f(x)/\mu - \sum_{i=1}^{n}\ln x_i$ is also $(1 + \tfrac13\beta)$-self-concordant. Finally we note that for any parameter $q \ge 1$, the above proof also holds true for $-q\ln(y - f(x)) - \sum_{i=1}^{n}\ln x_i$. This observation can be used to prove that for the classes of problems considered in this paper not only the logarithmic barrier function but also the center function of Huard [8] (also used in e.g. [21, 1), 10, 6]) is self-concordant.

3  The  dual  geometric  programming  problem 

Let $\{I_k\}_{k=1,\dots,r}$ be a partition of $\{1,\dots,n\}$ (i.e. $\bigcup_{k=1}^{r} I_k = \{1,\dots,n\}$ and $I_k \cap I_l = \emptyset$ for $k \ne l$). The dual geometric programming problem [2] is then given by

$$(\mathcal{DGP})\qquad \begin{array}{l}\min\ c^Tx + \sum_{k=1}^{r}\Big[\sum_{i\in I_k} x_i\ln x_i - \Big(\sum_{i\in I_k}x_i\Big)\ln\Big(\sum_{i\in I_k}x_i\Big)\Big]\\ Ax = b\\ x \ge 0,\end{array}$$

where $A$ is an $m \times n$ matrix and $c$ and $b$ are $n$- and $m$-dimensional vectors, respectively. For this problem we can prove the following lemma.

Lemma 3 The logarithmic barrier function of the dual geometric programming problem $(\mathcal{DGP})$ is 2-self-concordant.²

²This corrects a remark in [12], in which it is claimed that the self-concordance property does not hold for this problem.
4 The extended entropy programming problem

The extended entropy programming problem is defined as

$$(\mathcal{EEP})\qquad \begin{array}{l}\min\ c^Tx + \sum_{i=1}^{n} g_i(x_i)\\ Ax = b\\ x \ge 0,\end{array}$$

where $A$ is an $m \times n$ matrix and $c$ and $b$ are $n$- and $m$-dimensional vectors, respectively. Moreover, it is assumed that the scalar functions $g_i \in C^3$ satisfy $|g_i'''(x_i)| \le \kappa_i\, g_i''(x_i)/x_i$, $i = 1,\dots,n$. This class of problems is studied in Ye and Potra [23] and Han et al.³ [5]. In the case of entropy programming we have $g_i(x_i) = x_i\ln x_i$ for all $i$, and $\kappa_i = 1$. Self-concordance of the logarithmic barrier function of this problem follows directly from the following lemma.

Lemma 4 Suppose that $|g_i'''(x_i)| \le \kappa_i\, g_i''(x_i)/x_i$, $i = 1,\dots,n$; then the logarithmic barrier function for the extended entropy programming problem $(\mathcal{EEP})$ is $(1 + \tfrac13\max_i\kappa_i)$-self-concordant.
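As a worked check of the entropy case (a sketch that assumes the growth condition has the form $|g_i'''(x_i)| \le \kappa_i\, g_i''(x_i)/x_i$), take $g_i(x_i) = x_i\ln x_i$ with $x_i > 0$:

$$g_i'(x_i) = \ln x_i + 1,\qquad g_i''(x_i) = \frac{1}{x_i},\qquad g_i'''(x_i) = -\frac{1}{x_i^2},$$

$$|g_i'''(x_i)| = \frac{1}{x_i^2} = 1\cdot\frac{g_i''(x_i)}{x_i}.$$

The condition therefore holds with equality for $\kappa_i = 1$, and the concordance parameter obtained for entropy programming is $1 + \tfrac13 = \tfrac43$.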


5 The primal $l_p$-programming problem

Let $\{I_k\}_{k=1,\dots,r}$ be a partition of $\{1,\dots,m\}$ (i.e. $\bigcup_{k=1}^{r} I_k = \{1,\dots,m\}$ and $I_k \cap I_l = \emptyset$ for $k \ne l$). Let $p_i \ge 1$, $i = 1,\dots,m$. Then the primal $l_p$-programming problem [18, 22] can be formulated as

$$(\mathcal{PL}_p)\qquad \begin{array}{l}\max\ \eta^Tx\\ \sum_{i\in I_k}\frac{1}{p_i}\,|a_i^Tx - c_i|^{p_i} + b_k^Tx - d_k \le 0,\quad k = 1,\dots,r,\end{array}$$

where (for all $i$ and $k$) $a_i$, $b_k$ and $\eta$ are $n$-dimensional vectors, and $c_i$ and $d_k$ are real numbers. Nesterov and Nemirovsky [17] treated a special case of this problem, namely the so-called $l_p$-approximation problem. We will reformulate $(\mathcal{PL}_p)$ such that all problem functions remain convex, contrary to Nesterov and Nemirovsky's reformulation.

In a first step, the primal $l_p$-programming problem can be reformulated as:

$$\begin{array}{l}\max\ \eta^Tx\\ \sum_{i\in I_k}\frac{1}{p_i}\,t_i + b_k^Tx - d_k \le 0,\quad k = 1,\dots,r\\ s_i^{p_i} \le t_i,\quad a_i^Tx - c_i \le s_i,\quad -a_i^Tx + c_i \le s_i,\quad i = 1,\dots,m\\ s \ge 0.\end{array} \qquad (2)$$

In the same way as we will prove Lemma 5, it can be proved that the logarithmic barrier function for this reformulated $l_p$-programming problem is $(1 + \tfrac13\max_i|p_i - 2|)$-self-concordant, i.e. the concordance parameter depends on $p_i$. We can eliminate this dependence as follows. Replace the constraints $s_i^{p_i} \le t_i$ by the equivalent constraints $s_i \le t_i^{\pi_i}$, where $0 < \pi_i := \frac{1}{p_i} \le 1$, and replace the (redundant) constraints $s \ge 0$ by $t \ge 0$. So we obtain the following reformulated $l_p$-programming problem:

$$(\mathcal{PL}'_p)\qquad \begin{array}{l}\max\ \eta^Tx\\ \sum_{i\in I_k}\frac{1}{p_i}\,t_i + b_k^Tx - d_k \le 0,\quad k = 1,\dots,r\\ s_i \le t_i^{\pi_i},\quad a_i^Tx - c_i \le s_i,\quad -a_i^Tx + c_i \le s_i,\quad i = 1,\dots,m\\ t \ge 0.\end{array} \qquad (3)$$

Observe that the transformed problem has $4m + r$ constraints, compared with $r$ in the original problem $(\mathcal{PL}_p)$. Now we have the following lemma.

Lemma 5 The logarithmic barrier function for the reformulated $l_p$-programming problem $(\mathcal{PL}'_p)$ is $\tfrac53$-self-concordant.

³In [5] it is conjectured that the extended entropy programming problems do not satisfy the self-concordance condition. Lemma 4 shows that this conjecture is not true.
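The role of the change of variables can be illustrated by a one-dimensional computation. This is a sketch under two assumptions that go beyond the text: that Lemma 2 is applied to each nonlinear constraint separately, and that in one variable the condition of Lemma 2 reads $|f'''(x)| \le \beta\, f''(x)/x$. For the constraint $s^{p} \le t$ take $f(s) = s^{p}$:

$$f''(s) = p(p-1)s^{p-2},\qquad f'''(s) = p(p-1)(p-2)s^{p-3},\qquad |f'''(s)| = |p-2|\,\frac{f''(s)}{s},$$

so $\beta = |p - 2|$ and the parameter $1 + \tfrac13|p-2|$ grows with $p$. For the constraint $s \le t^{\pi}$ with $0 < \pi \le 1$ take $f(t) = -t^{\pi}$:

$$f''(t) = \pi(1-\pi)t^{\pi-2},\qquad f'''(t) = -\pi(1-\pi)(2-\pi)t^{\pi-3},\qquad |f'''(t)| = (2-\pi)\,\frac{f''(t)}{t},$$

so $\beta = 2 - \pi < 2$ and $1 + \tfrac13\beta < \tfrac53$ for every admissible $\pi$, which no longer depends on $p$.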


6 The dual $l_p$-programming problem

Let $q_i$ be such that $\frac{1}{p_i} + \frac{1}{q_i} = 1$, $1 \le i \le m$, and let the rows of a matrix $A$ be $a_i$, $i = 1,\dots,m$, and the rows of a matrix $B$ be $b_k$, $k = 1,\dots,r$. Then the dual of the $l_p$-programming problem $(\mathcal{PL}_p)$ is (see [18]-[22])

$$(\mathcal{DL}_p)\qquad \begin{array}{l}\min\ c^Ty + d^Tz + \sum_{k=1}^{r} z_k\sum_{i\in I_k}\frac{1}{q_i}\Big|\frac{y_i}{z_k}\Big|^{q_i}\\ A^Ty + B^Tz = \eta\\ z \ge 0.\end{array}$$

(If $y_i \ne 0$ and $z_k = 0$, then $z_k|y_i/z_k|^{q_i}$ is defined as $\infty$.) The above problem is equivalent to

$$\begin{array}{l}\min\ c^Ty + d^Tz + \sum_{i=1}^{m}\frac{1}{q_i}\,t_i\\ s_i^{q_i} z_k^{-q_i+1} \le t_i,\quad i \in I_k,\ k = 1,\dots,r\\ y \le s\\ -y \le s\\ A^Ty + B^Tz = \eta\\ z \ge 0\\ s \ge 0.\end{array} \qquad (4)$$

It can be proved that the logarithmic barrier function of this reformulated dual $l_p$-programming problem is $(1 + \tfrac{\sqrt2}{3}\max_i(q_i + 1))$-self-concordant. Again, the dependence on $q_i$ can be eliminated: the constraints $s_i^{q_i} z_k^{-q_i+1} \le t_i$ are replaced by the equivalent constraints $t_i^{\rho_i} z_k^{-\rho_i+1} \ge s_i$, where $0 < \rho_i := \frac{1}{q_i} < 1$, and the redundant constraints $s \ge 0$ are replaced by $t \ge 0$. The new reformulated dual $l_p$-programming problem becomes:

$$(\mathcal{DL}'_p)\qquad \begin{array}{l}\min\ c^Ty + d^Tz + \sum_{i=1}^{m}\frac{1}{q_i}\,t_i\\ s_i \le t_i^{\rho_i} z_k^{-\rho_i+1},\quad i \in I_k,\ k = 1,\dots,r\\ y \le s\\ -y \le s\\ A^Ty + B^Tz = \eta\\ z \ge 0\\ t \ge 0.\end{array} \qquad (5)$$

Note that the original problem $(\mathcal{PL}_p)$ has $r$ inequalities, and the reformulated problem $(\mathcal{DL}'_p)$ has $4m + r$. We now have the following lemma.

Lemma 6 The logarithmic barrier function of the reformulated dual $l_p$-programming problem $(\mathcal{DL}'_p)$ is 2-self-concordant.


7  Other  smoothness  conditions 

Relative  Lipschitz  condition 

Jarre [9] introduced the following relative Lipschitz condition (also used in e.g. [7]) for the Hessian matrix of the problem functions $f_i(x)$, $0 \le i \le m$, of $(\mathcal{CP})$:

$\exists M > 0$: $\forall v \in \mathbb{R}^n$, $\forall x, x + h \in \mathcal{F}^0$:

$$|v^T(\nabla^2 f_i(x + h) - \nabla^2 f_i(x))v| \le M\,\|h\|_H\; v^T\nabla^2 f_i(x)v, \qquad (6)$$

where $H$ is the Hessian matrix of the corresponding logarithmic barrier function. As shown in Jarre [10], if the Hessians of the problem functions $f_i$ of $(\mathcal{CP})$ fulfil this relative Lipschitz condition with parameter $M$, and if $f_i \in C^3$, then the associated logarithmic barrier function is $(1 + M)$-self-concordant. (The converse is not true.) Moreover, in [10] it is shown that the relative Lipschitz condition for the logarithmic barrier function is equivalent to self-concordance if the underlying function is three times continuously differentiable.

Monteiro  and  Adler’s  condition 

Monteiro and Adler [16] considered minimization problems with linear equality constraints and a separable convex objective function on the positive orthant of $\mathbb{R}^n$. The objective function $f(x) = \sum_{i=1}^{n} g_i(x_i)$ must satisfy the following condition:

There exist positive numbers $T$ and $p$ such that for all reals $x > 0$ and $y > 0$ and all $i = 1,\dots,n$, we have

$$y\,|g_i'''(y)| \le T\max\Big\{\Big(\frac{x}{y}\Big)^{p},\Big(\frac{y}{x}\Big)^{p}\Big\}\,g_i''(x).$$

Using Lemma 2 and substituting $y = x$ in the above condition, it is easy to see that $g_i$ satisfies (1) with $\beta = T$, i.e. that the logarithmic barrier function for such a problem is $(1 + \tfrac13 T)$-self-concordant. Using Lemma 2 we may simplify the condition of [16] to the (weaker) condition that there exists a positive number $T$ such that for all reals $y > 0$ and all $i = 1,\dots,n$, we have

$$y\,|g_i'''(y)| \le T\,g_i''(y).$$

This condition is not only simpler; the dependence on the extra parameter $p$ is also eliminated.

Scaled  Lipschitz  condition 

In [25] and [13] interior-point methods are given and analyzed for problems with linear equality constraints and a convex objective function $f(x)$ on the positive orthant of $\mathbb{R}^n$. The objective function has to satisfy the following scaled Lipschitz condition:

There exists $M > 0$ such that for any $\gamma$, $0 < \gamma < 1$,

$$\|X(\nabla f(x + \Delta x) - \nabla f(x) - \nabla^2 f(x)\Delta x)\| \le M\,\Delta x^T\nabla^2 f(x)\Delta x, \qquad (7)$$

whenever $x > 0$ and $\|X^{-1}\Delta x\| \le \gamma$. (Here, $\|\cdot\|$ is the Euclidean norm.)

This condition is also covered by the self-concordance condition if $f$ is three times continuously differentiable in the interior of the feasible domain. More precisely, the next lemma states that the corresponding logarithmic barrier function is $(1 + \tfrac23 M)$-self-concordant.

Lemma 7 Suppose $f(x) \in C^3$ fulfils the scaled Lipschitz condition with parameter $M$. Then the logarithmic barrier functions $\varphi$ and $\psi$ from Lemma 2 are $(1 + \tfrac23 M)$-self-concordant.
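A short sketch of where the factor in front of $M$ comes from, assuming $f \in C^3$ and $X = \operatorname{diag}(x)$ (the matrix $X$ is not defined in the text above, so this reading is an assumption). Put $\Delta x = \epsilon h$ in (7), divide by $\epsilon^2$, and let $\epsilon \to 0$. By Taylor's theorem the left-hand side tends to $\tfrac12\|X\nabla^3 f(x)[h,h,\cdot]\|$, so

$$\tfrac12\,\|X\nabla^3 f(x)[h,h,\cdot]\| \le M\,h^T\nabla^2 f(x)h.$$

The Cauchy-Schwarz inequality then gives

$$|\nabla^3 f(x)[h,h,h]| = \big|(X^{-1}h)^T X\nabla^3 f(x)[h,h,\cdot]\big| \le 2M\,h^T\nabla^2 f(x)h\,\|X^{-1}h\|,\qquad \|X^{-1}h\| = \sqrt{\sum_{i=1}^{n}\frac{h_i^2}{x_i^2}}.$$

This is a bound of the kind required in Lemma 2 with $\beta = 2M$, which yields the concordance parameter $1 + \tfrac13\beta = 1 + \tfrac23 M$.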

Before we conclude, we would like to briefly point out a class of problems considered by Mehrotra and Sun [15] (and also by Zhang [24]) which does not have a self-concordant logarithmic barrier function. Mehrotra and Sun introduced a curvature constraint of the following form: there exists a number $\kappa \ge 1$ such that for all $x$, $y$ and $h$ in $\mathbb{R}^n$

$$h^T\nabla^2 f_i(x)h \le \kappa\,h^T\nabla^2 f_i(y)h.$$

For  constraint  functions  /,  satisfying  this  condition,  they  present  a  polynomial  time  interior- 
point  algorithm  (that  needs  at  most  (9(/c*ymlnc)  Newton  iterations  to  reduce  the  error  by 
a factor of $\epsilon$). Clearly, there are constraints with self-concordant barriers that do not satisfy this condition, and conversely, this condition covers some constraint functions that do not have a self-concordant barrier function. For most applications, however, we believe that the self-concordance condition is more practical.
Abstract: This paper gives an overview of different modeling techniques in the realm of integer and mixed integer programming as well as in logical modeling. The modeling language LPL (Linear Programming Language) is used as a vehicle to express such models. The idea of having a unique representation scheme for mathematical and logical models is powerful, but its advantages are not yet widely recognized.

Initially, LPL was built to formulate the structure of larger LP models, in order to overcome the model management difficulties of big real-life LP models. In the meantime, the language specification has been extended several times to manage more complex models such as logical models.

IP and MIP techniques are applicable to a surprisingly large number of (logical) problems too. Methods are given to convert these models to pure MIP models. The LPL compiler translates pure logical sentences into IP restrictions such that a linear MIP solver can solve them.
1 INTRODUCTION


"One of the reasons why IP has not been applied anywhere
near as widely as it might to practical situations is the failure
to recognize when a problem can be cast in this mould."

Williams H.P.


This paper assumes a reader familiar with the basics of the LPL modeling language, which allows a modeler to formulate an LP model in the usual mathematical notation using the indexing mechanism described in [Hürlimann 1992]. A brief overview of LPL can also be found in [Hürlimann 1992a].

To  summarize,  the  main  features  of  LPL  (Linear  Programming  Language)  are: 

•  a  simple  syntax  of  models  with  indexed  expressions  close  to  the  mathematical 
notation,  and  directly  applicable  for  documentation 

• formulation of both small and large LPs with optional separation of the data from
the model structure

•  availability  of  a  powerful  index  mechanism,  making  model  structuring  very  flexible 

•  an  innovative  and  high-level  Input  and  Report  Generator 

•  intermediate  indexed  expression  evaluation  (much  like  matrix  manipulation) 

• automatic or user-controlled production of row- and column-names

•  tools  for  debugging  the  model  (e.g.  explicit  equation  listing) 

•  built-in  text  editor  to  enter  the  LPL  model 

•  fast  production  of  the  MPS  file 

•  open  interface  to  most  LP/MIP  solver  packages. 


Recently, LPL has been enhanced by several logical operators to exploit the power of integer programming. This extension, part of the new version 3.9 of LPL, is summarized in the second part of this abstract.

It is well known that logical statements can be translated into IP constraints containing 0-1 variables. The paper given at the conference will present the rules used by the LPL compiler to translate logical statements into IP constraints, and will describe a number of applications.
A surprisingly wide class of practical problems can be modeled using integer programming. There are applications in OR/MS including operational problems such as distribution of goods, production scheduling, and machine sequencing; planning problems such as capital budgeting, facility location, and portfolio analysis; and design problems such as communication and transportation network design and VLSI circuit design. There are also many applications in combinatorics, graph theory and logic. An even broader model class consists of mathematical models amalgamated with some logical conditions.

But the MIP formulation of a problem is sometimes far from trivial. Ingenious techniques and a lot of modeling experience are needed to formulate such problems. Often, a more natural formulation and representation for many problems in the mentioned domains is (a subset of) predicate logic. An extension of the LPL modeling language has recently been designed which introduces various logical operators, so that more natural formulation techniques can be used. This allows the modeler to formulate his problem in a subset of predicate logic. By default, the LPL compiler translates such a representation into a mixed-integer linear mathematical formulation so that a general MIP solver can be applied.

Several modeling techniques in integer programming are investigated in this paper. Formulation methods are given for different problems which can be expressed as MIP problems.

It is well known [Williams 1977] that Boolean expressions can be translated into
linear mathematical constraints such that, loosely speaking, the solution space is the
same. Suppose, as an example, that a mathematical model containing different linear
restrictions is given and the modeler wants, among other constraints, to add the logical
constraint 'X or Y', where X and Y are two propositional statements. Adding
'X or Y' means that the model is no longer a pure mathematical model, but a mixed
model of mathematical and logical constraints. To mould the whole model into a pure
mathematical form, the logical statement 'X or Y' must be replaced by the linear
constraint $x + y \ge 1$, where x and y are 0-1 variables with the meaning that they are 1 if
the corresponding proposition is true and 0 otherwise. It is not difficult to see that the
constraint $x + y \ge 1$ holds if and only if 'X or Y' is true.
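The equivalence between 'X or Y' and the 0-1 constraint $x + y \ge 1$ can be checked by enumerating the four truth assignments. The following is a minimal Python sketch; the function name `or_as_linear` is illustrative and not part of LPL.

```python
from itertools import product

def or_as_linear(x: int, y: int) -> bool:
    """Linear 0-1 form of 'X or Y': x + y >= 1."""
    return x + y >= 1

# Truth table of the linear constraint over all 0-1 assignments.
table = {(x, y): or_as_linear(x, y) for x, y in product((0, 1), repeat=2)}

# The inequality holds exactly when the proposition 'X or Y' is true.
assert all(table[x, y] == (bool(x) or bool(y)) for x, y in table)
```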

Although there are different methods to translate logical statements into 0-1 constraints,
a mechanical translation procedure is useful, since it allows a professional
MIP solver to be applied to such mixed problems. Furthermore, the translation step, if coded by
hand, would be very time-consuming and prone to errors. Hence, an automated
translation procedure is especially interesting for mixed models containing symbolic
and quantitative knowledge. But even pure logical models such as the SAT problem
(satisfiability problem) can be translated into an IP model. Since these problems are all
at least NP-hard, it is advantageous to approach their solutions using different methods.
Furthermore, an important subset of logical problems, those that can be formulated as Horn
clauses, is not NP-hard. Horn clauses translated into IP programs can be solved using
an LP solver, since it is sufficient to solve the LP relaxation of the IP program; the
integrality constraints are automatically fulfilled. This is an interesting aspect of many
(Horn) rule-based knowledge bases: instead of using inference and resolution
techniques to solve such problems, one may translate the problem into an LP problem
and solve the transformed problem; large LP problems can be solved quickly.
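As an illustration of how a SAT clause becomes an IP row, the sketch below uses the standard textbook translation: a positive literal contributes $x_k$, a negated literal contributes $1 - x_k$, and the sum must be at least 1. This is not necessarily the procedure implemented in LPL; the DIMACS-style integer encoding of literals and the helper names are assumptions made for the example.

```python
from itertools import product

def clause_to_row(clause):
    """Translate a clause (list of nonzero ints: k means x_k, -k means NOT x_k)
    into (coeffs, rhs) representing  sum_k coeffs[k] * x_k >= rhs."""
    coeffs, negated = {}, 0
    for lit in clause:
        var = abs(lit)
        coeffs[var] = coeffs.get(var, 0) + (1 if lit > 0 else -1)
        negated += lit < 0          # each negated literal moves a constant 1 to the rhs
    return coeffs, 1 - negated

def satisfies(row, x):
    """Check the linear row for a 0-1 assignment x (dict: variable -> 0/1)."""
    coeffs, rhs = row
    return sum(c * x[v] for v, c in coeffs.items()) >= rhs

# Horn clause (x1 and x2) implies x3, i.e. (NOT x1) or (NOT x2) or x3.
horn = [-1, -2, 3]
row = clause_to_row(horn)           # -x1 - x2 + x3 >= -1

for bits in product((0, 1), repeat=3):
    x = dict(zip((1, 2, 3), bits))
    logical = any((x[abs(l)] == 1) == (l > 0) for l in horn)
    assert satisfies(row, x) == logical
```

The loop confirms that the single linear row has exactly the same 0-1 solutions as the clause.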

It is relatively new to integrate a mechanical translation procedure into a modeling
system. McKinnon & Williams [1990] presented a procedure and its implementation in
Prolog, which accepts logical statements and outputs the corresponding linear
constraints. Lucas & Mitra & Moody [1992] also describe the specification of such a
converter procedure, which will be integrated into the CAMPS modeling system [Lucas
& Mitra 1988]. The present paper describes another translation procedure, which is
already included in the LPL language.

Although such a general converter is beneficial for many problems, one should not
conclude that every logical model should be translated into an IP model to solve it with a
general IP solver. Logical models are usually solved more efficiently by specialized
and more appropriate solvers. One should here clearly separate the formulation and the
solution process. LPL is a language which allows the modeler to formulate a model,
but the language has no general mechanism to solve the problem. Since it is generally
admitted that a general solver for mathematical and logical models will never show up,
and since many specialized and efficient solvers exist for different subsets of problems,
the motivation is to have at least a unique modeling language framework with which all
kinds of models might be formulated and which is formalized enough to be processed
automatically by a computer. If there is no hope for a unique universal solver, it might
at least be possible to have a unique language of formulation.

2  LPL  EXTENSION  FOR  LOGICAL  MODELING 

Mathematical  models  can  be  represented  by  the  LPL  language  in  a  form  close  to 
algebraic,  indexed  notation. 

Example: the constraint

$$a_i \le \sum_{j} b_{ij} + \sum_{k,l \,\mid\, S,\ 1 < k < n} c_{i,k-1,l}\, x_{kl} \qquad \text{for every } i \in T$$

would be formulated using the LPL syntax as

EQUATION R{i|T}: a[i] <= SUM{j} b[i,j] + SUM{k,l|S and k>1 and k<n} c[i,k-1,l]*x[k,l]

The statement contains data tables such as $c_{ikl}$ or $a_i$, an indexed variable $x_{kl}$, several
indices (i, j, ...), and arithmetic and logical operators. Several logical operators had been
incorporated into the older versions of the LPL language to evaluate Boolean
expressions. But Boolean expressions were not allowed in expressions containing
variables as operands. A Boolean expression, such as 'S and k>1 and k<n' in the last
example, can be evaluated immediately, since the operands are all 'known' quantities.

But consider now the following expression, where x is an unknown quantity:

$((x \le 2) \text{ or } (x \ge 5)) \text{ and } (x \ge 0)$

This Boolean expression contains a variable x and can, therefore, not be evaluated,
since the value of x is unknown. If x is between zero and two or at least 5, the
expression returns true; otherwise it returns false (Figure 1).

Figure 1 (number line with the points 0, 2 and 5 marked)

The second Boolean expression must be approached differently from the first
expression. Nevertheless, a modeling language should accept both expressions,
independently of how they are processed. In LPL, a Boolean expression such as
$((x \le 2) \text{ or } (x \ge 5)) \text{ and } (x \ge 0)$ can be written in a straightforward way and can be
integrated into the model as a model constraint such as

EQUATION MyConstraint: ((x<=2) or (x>=5)) and (x>=0);

To process such a constraint using a standard solver, it must be translated into a pure
logical or into a pure mathematical statement. Since LPL interacts well with an LP/MIP
solver, the compiler translates such constraints by default into linear constraints, in order to apply
the LP/MIP solver. But again, there might be a more efficient solver for the problem at
hand, and in this case the modeling system should translate the formulation into the
appropriate form.
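To make the idea of translating such a constraint into linear form concrete, the sketch below uses the common "big-M" technique with one auxiliary 0-1 variable d. This is an illustration only: the text does not state which linear rows the LPL compiler actually generates, and the upper bound U = 10 on x is an assumption needed to size the big-M constant.

```python
U = 10.0  # assumed upper bound on x (not given in the text)

def linear_system_feasible(x: float) -> bool:
    """True if some d in {0, 1} satisfies the big-M rows
         0 <= x <= U,
         x <= 2 + (U - 2) * (1 - d),   # d = 1 enforces x <= 2
         x >= 5 * (1 - d).             # d = 0 enforces x >= 5
    """
    for d in (0, 1):
        if 0 <= x <= U and x <= 2 + (U - 2) * (1 - d) and x >= 5 * (1 - d):
            return True
    return False

def logical(x: float) -> bool:
    """The original Boolean constraint."""
    return ((x <= 2) or (x >= 5)) and (x >= 0)

# Compare both formulations on a grid from -2.0 to 10.0 in steps of 0.25.
grid = [i / 4 for i in range(-8, 41)]
assert all(linear_system_feasible(x) == logical(x) for x in grid)
```

For every grid point within the assumed bounds, the mixed-integer rows admit a choice of d exactly when the Boolean expression is true.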

Table 1 summarizes all logical operators which are defined in LPL and which can be
used in the formulation of a constraint. Of course, all operators can also be used in
Boolean expressions which are evaluated immediately, as in

VAR X{i,j | ATLEAST(3) {j} a[i,j]};

The declaration is perfectly correct. It means that a variable X is declared for every (i,j)-
tuple such that at least three entries of row i in the (known) data matrix $a_{ij}$ are different from
zero.

Note also that the operators AND, OR, XOR, NOR, and NAND can be used as binary
operators as well as index-operators. As an example, AND{i} a[i] simply means a[1]
and a[2] and ... and a[n]; furthermore, x AND y can also be written as AND(x,y). For
a detailed syntax see the Reference Manual [Hurlimann 1992b]. It should also be noted
that AND{} has the same meaning as the old FORALL{} and that OR{} is the same
as EXIST{}.

EXACTLY, ATLEAST, and ATMOST are index-operators with a slightly different
syntax. The reserved word is followed by an expression surrounded by parentheses.
The expression

ATMOST(4) {i} a[i];

means that 'at most 4 out of all a[i] should be true (= non-zero)'. Several applications
that use these operators will be shown in this paper.
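Over 0-1 variables, the three counting operators correspond to simple bounds on a sum, and the indexed AND, OR, NOR, NAND and XOR of Table 1 are special cases. The following Python sketch checks these identities by enumeration; the function names mirror the LPL keywords but are otherwise hypothetical helpers, not LPL code.

```python
from itertools import product

def atleast(k, xs):
    """ATLEAST(k): sum of the 0-1 values is >= k."""
    return sum(xs) >= k

def atmost(k, xs):
    """ATMOST(k): sum of the 0-1 values is <= k."""
    return sum(xs) <= k

def exactly(k, xs):
    """EXACTLY(k): sum of the 0-1 values equals k."""
    return sum(xs) == k

n = 4
for xs in product((0, 1), repeat=n):
    assert atleast(1, xs) == any(xs)               # OR{i}
    assert atleast(n, xs) == all(xs)               # AND{i}
    assert atmost(0, xs) == (not any(xs))          # NOR{i}
    assert atmost(n - 1, xs) == (not all(xs))      # NAND{i}
    assert exactly(1, xs) == (xs.count(1) == 1)    # XOR{i} as defined in Table 1
```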


Operator              Alternative formulation          Interpretation
(x and y are any sub-expressions containing variables)

unary operators
NOT x                                                  x is false

binary operators
x AND y               ATLEAST(2)(x,y)                  both (x and y) are true
                      AND(x,y)
x OR y                ATLEAST(1)(x,y)                  at least one of x or y is true
                      OR(x,y)
x XOR y               EXACTLY(1)(x,y)                  exactly one is true (either or)
                      XOR(x,y)
                      (x OR y) AND (NOT x OR NOT y)
x IMPL y              NOT x OR y                       x implies y (implication)
x IFF y               (x IMPL y) AND (y IMPL x)        x if and only if y (equivalence)
                      NOT (x XOR y)
x NOR y               NOT (x OR y)                     none of x and y is true
                      NOT x AND NOT y                  (at most none is true)
                      ATMOST(0)(x,y)
x NAND y              NOT (x AND y)                    at most one is true
                      NOT x OR NOT y                   (at least one is false)
                      ATMOST(1)(x,y)

indexed operators
FORALL{i} x[i]        AND{i} x[i]                      all x[i] are true
AND{i} x[i]           ATLEAST(|i|){i} x[i]             all x[i] are true
EXIST{i} x[i]         OR{i} x[i]                       at least one out of all x[i] is true
OR{i} x[i]            ATLEAST(1){i} x[i]               at least one out of all x[i] is true
XOR{i} x[i]           EXACTLY(1){i} x[i]               exactly one out of all x[i] is true
NOR{i} x[i]           ATMOST(0){i} x[i]                none of all x[i] is true
NAND{i} x[i]          ATMOST(|i|-1){i} x[i]            at least one of x[i] is false
ATLEAST(k){i} x[i]                                     at least k out of all x[i] are true
ATMOST(k){i} x[i]                                      at most k out of all x[i] are true
EXACTLY(k){i} x[i]                                     exactly k out of all x[i] are true

Table 1: Logical operators in LPL

LPL also allows predicate variables to be introduced. They are simply declared as variables
of type LOGICAL, such as

VAR MyPredicate{i} LOGICAL;

To link the predicate with the rest of the otherwise mathematical model, an expression
can be attached, separated by an assign operator. Suppose that a predicate P is introduced
into the model with the meaning that it is true if another (real) variable x is strictly
between the lower (l) and upper (u) bounds. The following declaration introduces the
predicate and the real variable and links the predicate to the variable.

VAR x [l,u]; 'quantity x of product i produced with lower and upper bound'

VAR P LOGICAL : x; 'product i is manufactured (true or false!)'

Using this declaration, one can express the logical condition $P \leftrightarrow (l \le x \le u)$,
which means that P is true if and only if x is between the lower and upper bound. If the
lower bound for x is not declared, the declaration of P expresses the condition
$x > 0 \rightarrow P$ (which is the same as $\neg P \rightarrow x = 0$); if the upper bound of x is not declared,
it expresses the condition $P \rightarrow x > 0$; and if no bound for x was declared, an error is
generated by the LPL compiler.

It is also possible to link a predicate to any mathematical expression, such as

VAR x [lx,ux]; y [ly,uy];

VAR Q LOGICAL : (x>a) or (y<b);

The declaration of Q imposes the logical condition $Q \rightarrow ((x > a) \text{ or } (y < b))$. On the
other hand, if the modeler wants to impose the condition $((x > a) \text{ or } (y < b)) \rightarrow Q$, the
expression must be preceded by the symbol '<' as in

VAR Q LOGICAL :< (x>a) or (y<b);

More explicit examples will be shown in the paper.

Interior Cutting Plane Methods

The huge amount of research on interior point methods which has
been performed in the last few years also contains a number of
cutting plane methods for convex programming, nondifferentiable
optimization, integer programming and semi-infinite programming
based on interior point ideas. The basic idea of these "interior
cutting plane methods" is easy: work with a linear approximation of
the feasible set, which is iteratively updated. In each iteration a
certain interior point of the linear approximation is computed; if
this point is feasible for the original constraints then the algorithm
stops (feasibility problems) or an "objective cut" is added
(optimization problems); if, on the other hand, this point is infeasible
then a cutting plane is added. The methods differ e.g. in the way
the specific interior points are computed for which feasibility is
checked; these points will be called centers.
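The generic scheme can be sketched in a few lines for a feasibility problem in the plane. The sketch below uses analytic centers (computed by a damped Newton method) and central cuts, and never drops constraints; it is an illustration of the scheme under these assumptions, not Vaidya's volumetric algorithm. The names `analytic_center`, `cutting_plane` and `disc_oracle`, the starting box, and the disc-shaped target set are all hypothetical choices made for the example.

```python
import math

def analytic_center(cons, y, iters=50):
    """Damped Newton for min -sum(log(c - a.y)) from a strictly interior y.
    cons is a list of ((a0, a1), c), each meaning a0*y0 + a1*y1 <= c."""
    def barrier(p):
        s = [c - a[0] * p[0] - a[1] * p[1] for a, c in cons]
        return math.inf if min(s) <= 0 else -sum(math.log(v) for v in s)

    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for a, c in cons:
            s = c - a[0] * y[0] - a[1] * y[1]
            g0 += a[0] / s
            g1 += a[1] / s
            h00 += a[0] * a[0] / s ** 2
            h01 += a[0] * a[1] / s ** 2
            h11 += a[1] * a[1] / s ** 2
        det = h00 * h11 - h01 * h01
        d0 = -(h11 * g0 - h01 * g1) / det      # Newton direction -H^{-1} g
        d1 = -(h00 * g1 - h01 * g0) / det
        dec = -(g0 * d0 + g1 * d1)             # squared Newton decrement
        if dec < 1e-16:
            break
        t, f0 = 1.0, barrier(y)
        while t > 1e-12:                       # backtracking keeps y strictly feasible
            trial = (y[0] + t * d0, y[1] + t * d1)
            if barrier(trial) <= f0 - 0.25 * t * dec:
                y = trial
                break
            t *= 0.5
        else:
            break                              # no acceptable step within rounding
    return y

def cutting_plane(oracle, box=1.0, max_iter=500):
    """Find a point accepted by the oracle inside the box [-box, box]^2."""
    cons = [((1.0, 0.0), box), ((-1.0, 0.0), box),
            ((0.0, 1.0), box), ((0.0, -1.0), box)]
    y = (0.0, 0.0)
    for k in range(max_iter):
        y = analytic_center(cons, y)
        a = oracle(y)
        if a is None:
            return y, k
        # Step back so the restart point is strictly inside the old polytope
        # and strictly inside the new cut a.z <= a.y through the center.
        delta = 0.5 * min((c - b[0] * y[0] - b[1] * y[1]) / math.hypot(*b)
                          for b, c in cons)
        cons.append((a, a[0] * y[0] + a[1] * y[1]))
        y = (y[0] - delta * a[0], y[1] - delta * a[1])
    raise RuntimeError("no feasible point found")

def disc_oracle(center, radius):
    """Oracle for C = disc: None if y is in C, else a unit normal a with
    a.z <= a.y for all z in C."""
    def oracle(y):
        dx, dy = y[0] - center[0], y[1] - center[1]
        dist = math.hypot(dx, dy)
        return None if dist <= radius else (dx / dist, dy / dist)
    return oracle

point, cuts = cutting_plane(disc_oracle((0.6, -0.3), 0.1))
```

Each iteration only handles linear constraints, and the previous center (moved slightly back from the new cut) serves as a warm start for the next Newton solve.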

Methods that fit in the above scheme include

author(s)              ref.        center
Elzinga, Moore         [2]         ball centers
Goffin, Vial et al.    e.g. [3]    analytic centers
Vaidya                 [8]         volumetric centers
Den Hertog et al.      [4,5]       minimizers log. barrier
Atkinson, Vaidya       [1]         analytic centers

A great advantage of cutting plane methods as described above
is that in each iteration only linear constraints matter. Moreover,
successive approximations differ only slightly, which makes
"warm starts" possible. Without an efficient strategy for dropping
constraints the number of constraints in the linear approximation
would, however, grow too much.

From the practical point of view Goffin, Vial et al. have obtained
very good results with their implementations, since the algorithm
appears to be very stable. Theoretically the best algorithm of this
form is Vaidya's [8], which uses so-called volumetric centers. The
method needs a number of iterations which is bounded by $O(mL)$,
where m is the dimension of the space and L somehow measures
the input length of the convex program. This complexity bound
can be achieved by applying a very good and natural strategy
for dropping constraints; the number of constraints is bounded by
$O(m)$.

Vaidya’s  Volumetric  Center  Algorithm 
Preliminaries 

Consider the problem of finding $y \in C \subset \mathbb{R}^m$, where C is a convex
set. A major assumption made is that the set C is contained in a
ball of radius $2^L$ centered at the origin and that if C is nonempty
then it contains a ball of radius $2^{-L}$. The set C is successively
approximated by bounded full-dimensional polyhedra of the form
$P = \{\, y \mid A^T y + s = c,\ s \ge 0 \,\}$, such that $C \subset P$. Conforming to
'standard interior point notation' we denote the dimension of the
space by m, whereas the number of constraints in P is denoted by
n (as opposed to [8]). We denote the i-th column of A by $a_i$.

Many interior point methods make use of the analytic center of a
polytope, introduced by Sonnevend [7], which is the maximizer of

$$\max\Big\{\, \phi(y) = \sum_{i=1}^{n} \ln(s_i) \;\Big|\; A^T y + s = c \,\Big\}.$$

The Hessian of $-\phi(y)$ is given by $H(y) = A S^{-2} A^T$, where S denotes
the diagonal matrix with the components of the vector s on its
diagonal, and e is an all-one vector of appropriate dimension. The
volumetric center as introduced by Vaidya is now defined as the
minimizer $\omega$ of the potential function

$$F(y) = \tfrac{1}{2} \ln\det(H(y)).$$

To measure the distance of a point to a certain constraint the
following distance measure is introduced (this measure also plays
a role in methods based on analytic centers, e.g. [4]):

$$\sigma_i(y) = \frac{\sqrt{a_i^T H(y)^{-1} a_i}}{s_i}.$$

It can easily be shown that $\sigma_i(y) \le 1$ for $y \in P$. The smaller
$\sigma_i(y)$, the further the constraint is from y. Moreover, it holds that
$\sum_{i=1}^{n} \sigma_i(y)^2 = m$, since the $\sigma_i(y)^2$ are the diagonal elements of a
projection matrix. This result is a key ingredient to establish the
complexity bound.
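The two properties of the distance measure can be checked numerically on a small polygon. The sketch below assumes the definition $\sigma_i(y)^2 = a_i^T H(y)^{-1} a_i / s_i^2$ with $H(y) = A S^{-2} A^T$ and works in the plane (m = 2), so that the 2x2 inverse can be written out by hand; the function name and the sample polygon are hypothetical.

```python
def sigma_squared(cons, y):
    """Return [sigma_i(y)^2] for constraints a_i^T y <= c_i in the plane (m = 2).
    cons is a list of ((a0, a1), c); y must be strictly feasible."""
    slack = [c - a[0] * y[0] - a[1] * y[1] for a, c in cons]
    # H(y) = sum_i a_i a_i^T / s_i^2
    h00 = sum(a[0] * a[0] / s ** 2 for (a, _), s in zip(cons, slack))
    h01 = sum(a[0] * a[1] / s ** 2 for (a, _), s in zip(cons, slack))
    h11 = sum(a[1] * a[1] / s ** 2 for (a, _), s in zip(cons, slack))
    det = h00 * h11 - h01 * h01
    out = []
    for (a, _), s in zip(cons, slack):
        # a^T H^{-1} a using the explicit inverse of a symmetric 2x2 matrix
        quad = (h11 * a[0] * a[0] - 2 * h01 * a[0] * a[1] + h00 * a[1] * a[1]) / det
        out.append(quad / s ** 2)
    return out

# Unit box plus one extra cut; (0.2, -0.1) is strictly inside.
cons = [((1.0, 0.0), 1.0), ((-1.0, 0.0), 1.0), ((0.0, 1.0), 1.0),
        ((0.0, -1.0), 1.0), ((1.0, 1.0), 1.5)]
vals = sigma_squared(cons, (0.2, -0.1))

assert abs(sum(vals) - 2.0) < 1e-12      # the values sum to m = 2
assert all(0.0 < v <= 1.0 for v in vals)
```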

Algorithm 

Let the current polytope be $P^k$ with volumetric center $\omega^k$. Moreover,
we have a point $y^k$ approximating $\omega^k$. Let $\epsilon$ be a 'small
constant'.

The  algorithm  performs  iterations  of  the  form: 

Compute $r := \min_i \sigma_i(y^k)^2$. Depending on the value of r, one of
two cases applies. If $r > \epsilon$ then a cutting plane is added, which
is computed by calling an appropriate oracle. The constraint is
put in a 'nice position'. On the other hand, if $r < \epsilon$, the constraint
which is furthest away from the current center is dropped.
In both cases, a Newton-type procedure is performed to find an
approximation to the new center. If the resulting point is in C
then stop.

Analysis 

In both cases of the algorithm $O(1)$ Newton-type steps suffice. The
distance measure used to measure closeness to the exact center is in
fact the length of the search direction. As in the analysis of barrier
methods for convex programming (e.g. Nesterov and Nemirovskii
[6]), it is this measure that plays an important role in proving
polynomiality.

The function $F(y)$ is used as a potential function to measure the
volume of the polytope. Denote by $y^*$ the analytic center of P.
Then since

$$P \subset \{\, y \mid (y - y^*)^T H(y^*) (y - y^*) \le \ldots \,\}$$

it holds that

$$\mathrm{volume}(P) \le (2n)^m \big(\det H(y^*)\big)^{-1/2} \le (2n)^m \big(\det H(\omega)\big)^{-1/2} = (2n)^m \exp(-F(\omega)).$$

Vaidya has proved that after k iterations

$$F^k(\omega^k) - F^0(\omega^0) \ge \ldots$$

This is enough to show that if the algorithm does not stop with a
feasible point, then after $O(mL)$ iterations the volume of P must
fall below $2^{-mL}$, hence C is empty.

The main part of the paper [8] is a very technical analysis of the
effect of adding and dropping constraints on the potential function,
the position of the center and the distance measure. Also, the
Newton steps require careful analysis. In this talk
we will discuss Vaidya's method, which has very good theoretical
results but seems to get little attention in the field of interior
point methods. We will try to give some insight into these topics,
and stress the relationship with e.g. Nesterov and Nemirovskii's
[6] analysis.

Abstract

We present a new strategy for performing a line search on a logarithmic
barrier function. The strategy exploits the analyticity of the barrier function
and yields a simple and efficient method for finding the minimum of the barrier
function along a given line. For our theoretical investigations we define the
notion of self-concordance of order two for the restriction f of a logarithmic
barrier function to the real line. Based on this notion we define a new search
step for the line search and prove a bound on the size of the new search step
which is slightly better than the optimal bound known for the case of a
self-concordant function (of order one). We conclude with some numerical examples
that illustrate the potential of the new line search.

1.  INTRODUCTION 

Interior-point methods have found wide interest in many applications in the recent
past. The development of an efficient subroutine for a line search is a very important
detail of the implementation of interior-point methods for solving linear or nonlinear
convex programs, but it has received little attention so far. Even though the line search is
"only" a one-dimensional problem, it is far from trivial, and our analysis below shows
the rich structure that can be exploited. We devise a cheap, cubically convergent
procedure that is faster than a suitably damped version of Newton's method. The
underlying work is motivated by two articles by Murray and Wright [3, 4] that
investigate the same problem but are completely different in their approach.

The analysis of this article is applicable to a class of problems with non-linear
constraints that includes as special cases

•  linear and quadratic programming problems, and

•  minimization problems over the cone of positive semidefinite matrices.

For simplicity we present our analysis for the case of a linear program, and
outline briefly the modifications that are necessary for extensions to other classes of
problems. Let

$$\varphi(x, \mu) = \frac{c^T x}{\mu} - \sum_{i=1}^{m} \log(b_i - a_i^T x) \qquad (1.1)$$

be a barrier function for a linear program. (The same analysis also holds for the
barrier function $\varphi(x,\mu) := c^T x/\mu - \sum \log x_i$ with the additional constraint $Ax = b$.)
Let x be some strictly feasible point for $\varphi$ and h be some search direction for finding
the minimum of $\varphi$. Without loss of generality we assume for the rest of this article
that h is a descent direction for $\varphi$. Define the function

$$f(t) = \varphi(x + th). \qquad (1.2)$$

The function f hence depends on the barrier function $\varphi$ as well as on x and h. By
the assumption on h, $f'(0) < 0$.

Institut für Angewandte Mathematik, University of Würzburg, 8700 Würzburg, (West)
Germany.


2.  KNOWN  RESULTS 

As shown in [5] (for a more specialized but shorter analysis see also [1]), for any x
and h as above, the function f is convex and satisfies the following self-concordance
relation

$$f'''(0) \le 2 f''(0)^{3/2}. \qquad (2.1)$$

This relation is true for functions f generated by a large class of logarithmic barrier
functions including the two special cases mentioned above. (A slightly more
generalized case is to replace the constant 2 in (2.1) by some other positive constant.) In
the following we will always assume that f satisfies (2.1).

It is our goal to perform a line search, i.e. to find the minimum of f for some
t > 0. The following statements are proved in [5] for functions f satisfying (2.1):

•  If the Newton step

   $$\Delta t = -\frac{f'(0)}{f''(0)} \qquad (2.2)$$

   has H-norm less than one, i.e. if

   $$\|\Delta t\|_H := \sqrt{f''(0)}\,|\Delta t| < 1 \qquad (2.3)$$

   holds, then f has a minimum. (Moreover, the barrier function $\varphi$ generating
   f via (1.2) also has a global minimum.) Here, the notation H-norm is derived
   from the higher dimensional case where an analogous result holds true and H
   stands for the Hessian of f (respectively of $\varphi$) and $\|v\|_H = \sqrt{v^T D^2\varphi(x,\mu)\, v}$.
   We maintain this notation even though $\Delta t$ is only a scalar.

•  Second, if

   $$\|t\|_H < 1, \quad \text{i.e. if} \quad |t| < 1/\sqrt{f''(0)}, \qquad (2.4)$$

   then t is strictly feasible for f.


These results already supply the foundation for a simple step length rule for a
line search and are also true for the higher dimensional case. A possible rule for
the search step s, referred to as the reduced Newton step, for which [5] proved global
convergence, is

$$s = \frac{\Delta t}{1 + \|\Delta t\|_H}. \qquad (2.5)$$

Lemma 1

A line search using the reduced Newton step is monotonically convergent, i.e. the
reduced Newton step s is "a little too short".

Proof: 

We only prove monotonicity of the line search. Let $a = f'(0) < 0$, $b = f''(0) > 0$,
and let $\bar t$ be the zero of $g(t) := f'(t)$. We show that $s \le \bar t$ holds true by applying the
differential inequality (2.1) (self-concordance of f) to the function $g = f'$, i.e.

$$g(0) = a, \quad g'(0) = b, \quad g''(t) \le 2\, g'(t)^{3/2}. \qquad (2.6)$$

The extremal solution v that satisfies $v'' = 2 v'^{3/2}$ is an upper bound for g. (Since g'
satisfies a first order initial value inequality it follows that $g' \le v'$, and hence also $g \le v$.)
The function v is given by $v(t) = (b^{-1/2} - t)^{-1} + a - b^{1/2}$ and has the (unique) zero
$s = -a b^{-1} / (1 - a b^{-1/2})$, which is the reduced Newton step (2.5). By construction of v,
we may conclude that $s \le \bar t$. □
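The monotonicity stated in Lemma 1 can be observed numerically. The sketch below applies the reduced Newton step $s = \Delta t / (1 + \|\Delta t\|_H)$ repeatedly to the sample barrier $f(t) = -m \log(1 - t) - \log t$ on (0, 1), whose minimizer $1/(m+1)$ follows from setting $f'(t) = 0$. The choice of this test function, of m = 100, and of the starting point are assumptions made for illustration.

```python
import math

def derivs(t, m):
    """f'(t) and f''(t) for f(t) = -m*log(1 - t) - log(t) on (0, 1)."""
    return m / (1 - t) - 1 / t, m / (1 - t) ** 2 + 1 / t ** 2

def reduced_newton(t, m, steps):
    """Iterate the reduced Newton step and return all visited points."""
    path = [t]
    for _ in range(steps):
        d1, d2 = derivs(t, m)
        newton = -d1 / d2                                   # Newton step
        t += newton / (1 + abs(newton) * math.sqrt(d2))     # damped by 1 + H-norm
        path.append(t)
    return path

m = 100
t_min = 1 / (m + 1)                  # zero of f'
path = reduced_newton(0.001, m, 60)  # start to the left of the minimizer
```

Starting to the left of the minimizer, the iterates in `path` increase towards `t_min` without overshooting it (up to rounding), which is the "a little too short" behaviour described in the lemma.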

In the following we will further exploit the properties of the logarithmic barrier
function and obtain additional results that are impractical in higher dimensions but
appear to be useful for a line search.

First we illustrate with a simple example the difficulties that arise when trying
to devise a heuristic for a line search.


3.  A  SIMPLE  EXAMPLE 

Consider the case that the interval (0, 1) is given by m (identical) constraints of the
form $t \le 1$ ($m > 1$) and the constraint $t \ge 0$. Even though this appears to be a
very artificial example, the situation arising here is typical for the problems that are
encountered during a line search. The resulting barrier function is

$$\varphi(t) = -m \log(1 - t) - \log t. \qquad (3.1)$$

Clearly, the minimum of this function is at $\bar t = 1/(m+1)$. If we choose an initial point
for Newton's method for finding $\bar t$ that is fairly close to $\bar t$, e.g. $t_0 = 1/\sqrt{m}$, we find
that the Newton step starting at $t_0$ has length $\approx 1/2$. This means that the Newton step
is too long by a factor of about $\sqrt{m}$, and the resulting point is outside of the feasible set. If
we choose a starting point that is further away from $\bar t$, e.g. $t_0 = 1/2$, we find to our
surprise that the Newton step is exact. And if we move further away from $\bar t$ it turns
out that the Newton step becomes way too short. Summarizing, we see that moving
from $t_0 = 1$ to $t_0 = 0$, the Newton step is first too short, then too long (much too
long, by a factor of $O(\sqrt{m})$) and then again too short. Also, the information that
the H-norm of the Newton step is large does not give any information whether the
Newton step is too long or too short. Thus it is hard to identify the domain in
which the Newton step contains some useful information, and the domain in which
bisection (and backtracking) would be the best choice to minimize f. Note further
that a seemingly "good" Newton step starting at $t_0 = 0.45$ that is only by a factor
1.1 too long still yields an infeasible point and is therefore useless. In the next
section we describe a property that may help us to further analyze f and to develop
new results that improve the rate of convergence.
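The behaviour of the plain Newton step on this example is easy to reproduce. The short Python sketch below evaluates the Newton step for $\varphi(t) = -m \log(1 - t) - \log t$ at several starting points; the value m = 10000 and the particular list of starting points are assumptions chosen for illustration.

```python
import math

def newton_step(t, m):
    """Newton step -phi'(t)/phi''(t) for phi(t) = -m*log(1 - t) - log(t)."""
    d1 = m / (1 - t) - 1 / t
    d2 = m / (1 - t) ** 2 + 1 / t ** 2
    return -d1 / d2

m = 10_000
t_min = 1 / (m + 1)   # zero of phi'

for t0 in (1 / math.sqrt(m), 0.45, 0.5, 0.9):
    step = newton_step(t0, m)
    print(f"t0={t0:.4f}  newton={step:+.4f}  "
          f"exact={t_min - t0:+.4f}  lands at {t0 + step:+.4f}")
```

A negative landing point means the Newton step leaves the feasible interval (0, 1); the starting point 1/2 is the one for which the Newton step coincides with the exact step.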
4.  SELF-CONCORDANCE  OF  ORDER  TWO

Similarly to the definition of self-concordance, we may observe a second property of
the logarithmic barrier function, namely that

$$f''''(t) \le 6 f''(t)^2. \qquad (4.1)$$

This property will be referred to as self-concordance of order 2. The proof that this
estimate is valid for the logarithmic barrier function of linear constraint functions is
straightforward: Clearly, if f(t) satisfies property (4.1) then so does f(rt + s) with
r, s constants, $r \ne 0$, i.e. the property is invariant under affine transformations.
Also, if $f_1$ and $f_2$ satisfy (4.1) then so does $f_1 + f_2$ (and for r > 0, rf satisfies
a similar property with 6 replaced by 6/r). Since the relation is trivially true for
$f(t) = -\log t$ this completes the proof. Similarly we obtain for the restriction to the
real line of the logarithmic barrier function of a semidefiniteness constraint

$$f(t) = -\log\det(X + tY) = -\log\det X - \log\det(I + t X^{-1} Y) \qquad (4.2)$$

with a positive definite matrix X and a symmetric matrix Y that

$$f^{(k)}(t) = (-1)^k (k-1)!\ \mathrm{trace}\big(((X + tY)^{-1} Y)^k\big) \qquad (4.3)$$

and that

$$6 f''(0)^2 = 6\,\mathrm{trace}(B)^2 \ge 6\,\mathrm{trace}(B^2) = f''''(0) \qquad (4.4)$$

where $B = (X^{-1/2} Y X^{-1/2})^2$ is positive semidefinite. (Note that by trace(AB) =
trace(BA) we may commute a factor $(X + tY)^{-1/2}$ to the right for our convenience.)

5.  ADDITIONAL  RESULTS 

This section is motivated by the fact that higher order derivatives are easily
computable for functions f generated by the logarithmic barrier function of linear
constraints and of the determinant. If f is generated by the function $\varphi$ in (1.1) as in
(1.2), i.e.

$$f(t) = \frac{c^T (x + th)}{\mu} - \sum_{i=1}^{m} \log\big(b_i - a_i^T (x + th)\big), \qquad (5.1)$$

the derivatives of f are given by

$$f^{(k)}(t) = (k-1)! \sum_{i=1}^{m} \left( \frac{a_i^T h}{b_i - a_i^T (x + th)} \right)^{k} \qquad (5.2)$$

for $k \ge 2$. The "expensive" part ($O(mn)$) is to compute all the scalar products $a_i^T x$
and $a_i^T h$, but once these are known, it is comparatively cheap ($O(m)$) to compute
higher order derivatives¹. We point out that exactly the same situation holds for
the logarithmic barrier function f of (4.2) with

$$f^{(k)}(t) = (-1)^k (k-1)! \sum_{i} \lambda_i^k,$$

where the $\lambda_i$ are all the eigenvalues of the matrix $(X + tY)^{-1} Y$.
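The cheap evaluation of higher derivatives for linear constraints can be sketched as follows. The code assumes the derivative formula $f^{(k)}(0) = (k-1)! \sum_i (r_i / s_i)^k$ for $k \ge 2$, where $s_i = b_i - a_i^T x$ and $r_i = a_i^T h$ are the precomputed scalar products (this follows by differentiating $-\log(s_i - t r_i)$ repeatedly); the numerical values of the slacks and rates are hypothetical. It then checks the two self-concordance inequalities $|f'''(0)| \le 2 f''(0)^{3/2}$ and $f''''(0) \le 6 f''(0)^2$.

```python
import math

def barrier_derivatives(slack, rate, kmax=4):
    """Derivatives f^(k)(0), k = 2..kmax, of f(t) = -sum(log(s_i - t*r_i)).
    Cost is O(m) per derivative once slack (s_i) and rate (r_i) are known."""
    return {k: math.factorial(k - 1) * sum((r / s) ** k for s, r in zip(slack, rate))
            for k in range(2, kmax + 1)}

slack = [0.5, 1.2, 0.3, 2.0]     # hypothetical values of b_i - a_i^T x (all > 0)
rate = [0.4, -0.7, 0.1, 1.5]     # hypothetical values of a_i^T h

d = barrier_derivatives(slack, rate)

assert abs(d[3]) <= 2 * d[2] ** 1.5     # self-concordance
assert d[4] <= 6 * d[2] ** 2            # self-concordance of order two
```

The linear term of the barrier only contributes to the first derivative, which is why it does not appear in the helper.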

The cheap computation of $f^{(k)}$ allows two modifications of the line search. Rather
than taking the Newton step $\Delta t$ (2.2) for the line search (with some suitable damping
factor as above) we suggest using the zero of the second order approximation
to f'. More precisely, let

$$a = f'(0) < 0, \quad b = f''(0) > 0, \quad \text{and} \quad c = f'''(0). \qquad (5.3)$$

Define the discriminant $D = \max\{0,\ b^2 - 2ac\}$ and let

$$dt = -2a / (b + \sqrt{D}) \qquad (5.4)$$

be the cubic search step. (It is trivial to derive that for $b^2 - 2ac \ge 0$ this value of dt
is one of the zeros of the second order approximation to f', i.e. it is a critical point
of the cubic approximation to f.)
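A direct transcription of the cubic search step, $dt = -2a/(b + \sqrt{D})$ with $D = \max\{0, b^2 - 2ac\}$, is sketched below and compared with the Newton step on the sample barrier $f(t) = -m \log(1 - t) - \log t$. The test function and the values m = 100 and $t_0 = 0.008$ are assumptions chosen so that the discriminant is positive.

```python
import math

def cubic_search_step(a, b, c):
    """Cubic search step for a = f'(0) < 0, b = f''(0) > 0, c = f'''(0)."""
    disc = max(0.0, b * b - 2 * a * c)
    return -2 * a / (b + math.sqrt(disc))

# Derivatives of f(t) = -m*log(1 - t) - log(t) at the current point t0.
m, t0 = 100, 0.008
a = m / (1 - t0) - 1 / t0
b = m / (1 - t0) ** 2 + 1 / t0 ** 2
c = 2 * m / (1 - t0) ** 3 - 2 / t0 ** 3

dt = cubic_search_step(a, b, c)
newton = -a / b
t_min = 1 / (m + 1)   # exact minimizer of f on (0, 1)

# When the discriminant is nonnegative, dt is a zero of a + b*t + c*t^2/2.
assert abs(a + b * dt + 0.5 * c * dt * dt) < 1e-8
```

In this instance the point `t0 + dt` lies closer to `t_min` than the Newton point `t0 + newton`.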

Clearly,  the  cubic  search  step  will  be  locally  cubically  convergent,  but  here,  our 
interest  is  rather  on  how  to  extend  the  domain  of  rapid  convergence  compared  to 
damped  Newton’s  method.  This  goal  is  addressed  in  our  second  suggestion  below 
for  modification  of  the  line  search. 

Before we continue, we briefly recall some known results (taken from [6]) about
the Weierstraß $\wp$-function that will be used below. The Weierstraß $\wp$-function is
determined by two parameters $g_2$ and $g_3$. We are only interested in the case that $\wp$ is
defined by the invariants $g_2 = 0$ and $g_3 = 1$ (this is referred to as the equianharmonic
case in the literature) and in real arguments for $\wp$. For this case, the description of
$\wp$ involves the following constants:

$$\omega_2 = \frac{\Gamma(1/3)^3}{4\pi} \approx 1.529954037, \quad e_2 = \frac{1}{\sqrt[3]{4}}, \quad \eta_2 = \frac{\pi}{2\omega_2\sqrt{3}} \approx 0.59276269754. \qquad (5.5)$$

The function $\wp$ is periodic with period $2\omega_2$, and has poles at $t = 2n\omega_2$, $n \in \mathbb{Z}$.
Further, for $t \in (0, 2\omega_2)$: $\wp(t) \ge e_2$. Numerically, $\wp$ can easily be evaluated by its
power series expansion. We list two possible expansions for $\wp$:

$$\wp(t) = \frac{1}{t^2} + \frac{t^4}{28} + \frac{t^{10}}{10192} + \frac{t^{16}}{5422144} + O(t^{22}) \qquad (5.6)$$

for $t \in (0, 1)$ e.g., and

$$\wp(t + \omega_2) = e_2 + 3 e_2 x \Big[ 1 + x + x^2 + \tfrac{6}{7} x^3 + \tfrac{5}{7} x^4 + \tfrac{4}{7} x^5 + O(x^6) \Big] \qquad (5.7)$$

where $x = e_2 t^2$. (Both $\wp(t)$ and $\wp(\omega_2 + t)$ are even functions.) The above expansions
also yield the expansions for $\wp'$ and for the Weierstraß $\zeta$-function, with $\zeta' = -\wp$.
(The constant term of $\zeta$ in (5.6) is zero, in (5.7) it is $\eta_2$.) It is known that $\wp$ satisfies
the following differential equations:

$$\wp'^2 = 4\wp^3 - 1, \qquad \wp'' = 6\wp^2.$$

¹ For linear constraints, it is also cheap to evaluate the function f and its derivatives for other
values of t, once f is evaluated for t = 0.

Finally, the inverse functions to $\wp$ and to $\zeta$ are also known by their power series
expansions. Let $\Theta : [e_2, \infty) \to (0, \omega_2]$ be the inverse function of $\wp$, then

$$\Theta(z) = z^{-1/2} \Big[ 1 + \frac{u}{7} + \frac{3u^2}{26} + \frac{5u^3}{38} + \frac{7u^4}{40} + \frac{63u^5}{248} + \frac{231u^6}{592} + \frac{429u^7}{688} + \ldots \Big]$$

where $u = z^{-3}/8$ and $z > 0.9$. For $z \in (e_2, 0.9)$ the following expansion may be
used:

$$\Theta(e_2 + z) = \omega_2 - \ldots + O(z^5).$$


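The inverse function $\Omega : \mathbb{R} \to (0, 2\omega_2)$ to $\zeta$ has the power series

$$\Omega(z) = \frac{1}{z}\left[1 - \frac{\gamma}{7} + \frac{17\gamma^2}{143} - \frac{496\gamma^3}{3553} + O(\gamma^4)\right] \qquad (5.8)$$

for $z > 0.75$ and $\gamma = 1/(20z^6)$, and

$$\Omega(z) = \omega_2 - \frac{z - \eta_2}{e_2}\left[1 - y + \frac{12y^2}{5} - \frac{267y^3}{35} + \frac{139y^4}{5} - \frac{30192y^5}{275} + \frac{1634208y^6}{3575} + O(y^7)\right]$$

where $y = (z - \eta_2)^2/e_2$ and $|z - \eta_2|$ is small. Finally, it holds that $\zeta(z + 2\omega_2) = \zeta(z) + 2\eta_2$ ($\wp$ is periodic but $\zeta$ is not!) and that $\zeta$ is an odd function. From this we obtain the relation $\Omega(z) = 2\omega_2 - \Omega(2\eta_2 - z)$, which allows us to evaluate $\Omega$ for $z < 0.45$, e.g., via (5.8).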
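The two large-argument inverse series can be checked numerically by composing them with the forward expansions. The sketch below assumes the equianharmonic Laurent series for $\wp$ and its termwise antiderivative for $\zeta$ (with zero constant term). The coefficients of the inverse of $\wp$ are generated from the closed form $\binom{2k}{k} / (2^k (6k+1))$, which follows from integrating $ds/\sqrt{4s^3 - 1}$ term by term. Function names are illustrative only.

```python
from math import comb, sqrt


def wp(t):
    """Laurent expansion of p about t = 0 (g2 = 0, g3 = 1), up to t^16."""
    return 1.0 / t**2 + t**4 / 28.0 + t**10 / 10192.0 + t**16 / 5422144.0


def zeta(t):
    """Termwise antiderivative of -p with zero constant term, up to t^17."""
    return 1.0 / t - t**5 / 140.0 - t**11 / 112112.0 - t**17 / 92176448.0


def wp_inverse(z, terms=8):
    """Inverse of p on (0, omega_2] for large z, in powers of u = z^-3 / 8."""
    u = z**-3 / 8.0
    series = sum(comb(2 * k, k) / (2**k * (6 * k + 1)) * u**k
                 for k in range(terms))
    return series / sqrt(z)


def zeta_inverse(z):
    """Inverse of zeta for large z, in powers of gamma = 1 / (20 z^6)."""
    g = 1.0 / (20.0 * z**6)
    return (1.0 - g / 7.0 + 17.0 * g**2 / 143.0 - 496.0 * g**3 / 3553.0) / z
```

For example, `wp(wp_inverse(2.0))` should reproduce 2.0 closely, and `zeta_inverse(zeta(1.0))` should be close to 1.0, up to the truncation error of the series.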

We now return to the problem of increasing the step length for the line search. Let $v(t) := f''(t)$. Self-concordance of order two implies that $v''(t) \le 6v(t)^2$. We are given $v(0) = b$ and $v'(0) = c$ and intend to determine the domain of the function $v$ for positive $t$ as well as a lower bound of the location of the minimum of $f$. As in the proof of Lemma 1, the structure of this differential inequality implies that the solution of the initial value problem

$$w''(t) = 6w(t)^2, \qquad w(0) = b > 0, \qquad w'(0) = c \qquad (5.9)$$

is an upper bound for $v$. The general solution of (5.9) is the $\wp$-function with invariants $g_2 = 0$ and $g_3 \in \mathbb{R}$.² In our case, the solution can be reduced to the equianharmonic case $g_2 = 0$, $g_3 = 1$ by the transformation

$$w(t) = \alpha^2 \wp(\alpha t + x)$$

with

$$\alpha = \sqrt[6]{4b^3 - c^2} \quad \text{and} \quad x = \begin{cases} \Phi(b/\alpha^2), & c \le 0, \\ 2\omega_2 - \Phi(b/\alpha^2), & c > 0. \end{cases}$$

²The case that $g_2 = g_3 = 0$ yields $\wp(t) = 1/(t - \tau)^2$.


It is easily verified that $w$ indeed satisfies (5.9). Since $v \le w$, the pole of $v$ must lie behind the one of $w$, which gives us the following Lemma:

Lemma 2

If $\alpha$ and $x$ are given as above, $t > 0$ and

$$\|\Delta\|_H = t\sqrt{b} < \frac{\sqrt{b}}{\alpha}(2\omega_2 - x)$$

then $t$ is strictly feasible³.

More important yet, we may also derive a lower bound on the minimum $\hat t$ of $f$: Let $V$ be an antiderivative to $w$, e.g. $V(t) = -\alpha\,\zeta(\alpha t + x)$. The value $\tilde s$ with $V(\tilde s) - V(0) = \int_0^{\tilde s} w(t)\,dt = -a$ is a lower bound for $\hat t$ that is better than the bound derived in Lemma 1 and can easily be computed using the function $\Omega$ given above. We obtain

$$\tilde s = \alpha^{-1}\left(\Omega\!\left(\frac{a}{\alpha} + \zeta(x)\right) - x\right), \qquad (5.10)$$

and will refer to $\tilde s$ as the reduced cubic search step. The construction of $\tilde s$ implies the following lemma.

Lemma 3

The reduced cubic search step $\tilde s$, the reduced Newton step $\bar s$, and the minimum $\hat t$ of $f$ satisfy the relation

$$0 < \bar s \le \tilde s \le \hat t.$$

Moreover, the reduced cubic search step is the maximum possible step that is less than $\hat t$ for given values of $f$, $f'$, $f''$ and $f'''$ at $t = 0$.

Unfortunately, the reduced cubic search step $\tilde s$ (as well as the cubic search step $\delta t$) also does not guarantee a satisfactory rate of convergence for all starting points $t_0$ (e.g. not for $t_0 > 0.5$ in the example of Section 3). We therefore suggest a simple modified cubic search method for the first step of the line search. Set the initial search step $s = \tilde s$ to the reduced cubic search step. Then repeatedly quadruple $s$ as long as $s$ is feasible for $f$ and the absolute value of $f'(s)$ decreases. In the next section we compare the different methods.
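The quadrupling rule for the first line-search step can be sketched as follows. Here `fprime` and `is_feasible` are assumed callables supplied by the caller, and the iteration cap is an added safeguard that the text does not mention.

```python
def modified_cubic_first_step(s0, fprime, is_feasible, max_steps=50):
    """Quadruple the step while it stays feasible and |f'| keeps decreasing.

    s0          -- initial step (the reduced cubic search step in the text)
    fprime      -- callable returning f'(s)
    is_feasible -- callable returning True if s lies in the domain of f
    max_steps   -- safeguard against endless growth (an assumption)
    """
    s = s0
    for _ in range(max_steps):
        trial = 4.0 * s
        # Stop as soon as the larger step leaves the domain or no longer
        # reduces the magnitude of the derivative.
        if not is_feasible(trial) or abs(fprime(trial)) >= abs(fprime(s)):
            break
        s = trial
    return s
```

With the toy derivative `fprime = lambda t: t - 1.0`, the domain `t < 10` and `s0 = 0.01`, the step grows through 0.04 and 0.16 to 0.64 and stops there, because 2.56 would overshoot the minimum at 1.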

6.  SOME  NUMERICAL  EXAMPLES 

In Table 1 we compare the reduced Newton method, the reduced cubic method, and a combined cubic method that uses the modified cubic search step (outlined at the end of the previous section) in the first iteration (to come close to the minimum), and then switches to the reduced cubic method. Each row lists the starting point $t_0$ and the average number of function evaluations needed by each method for 20 random examples with 5000 constraints for the interval [0, 1]. (We assume that the derivatives are cheap once $f$ is evaluated.) In these examples 100 constraints lie in the interval [-10, 0] and 4900 in the interval [1, 2]. The minima of the barrier functions lie around $t \approx 0.0003$. As final accuracy we require $\|\Delta t\|_H \le 0.01$. (This corresponds to about 10 digits relative accuracy for these examples.)

³For linear or semidefinite constraints this bound is merely for illustration since the exact domain is known.

t_0      red. Newton   red. cubic   comb. cubic
10-®     29.2          5.1          12.2
0.1      9             8            6.6
0.999    127.7         104.7        39.2

Table 1

Since we are interested in the global convergence behaviour we chose starting points that all have a poor accuracy (relative to the feasible domain [0, 1]). Clearly, the last row has a very bad starting point. Below, in Table 2 we list an example with 5000 constraints and a linear objective $\lambda t$. There are 2500 constraints in [-10, 0] and 2500 in [1, 11]. The value $\hat t$ is the approximate minimum, and is almost constant for the 20 random examples in each row.


λ        t_0      t̂        red. Newton   red. cubic   comb. cubic
10-*     0.1      10'®     12.9          11.5         8.7
10-®     10“®     10“®     21            3.6          8.2
10^-10   0.1      10^-10   13.4          11.9         9.4
10^-10   10“*^    10^-10   27            3            8
Table 2

A Projected Conjugate Gradient Method
for Sparse Minimax Problems.

A new method for nonlinear minimax problems is presented. The method is of the trust region type and based on sequential linear programming. It is a first order method that only uses first derivatives and does not approximate Hessians. The new method is well suited for large sparse problems as it only requires that software for sparse linear programming and a sparse symmetric positive definite equation solver are available. On each iteration a special linear/quadratic model of the function is minimized, but contrary to the usual practice in trust region methods the quadratic model is only defined on a one-dimensional path from the current iterate to the boundary of the trust region. Conjugate gradients are used to define this path. One iteration solves an LP subproblem and requires three function evaluations and one gradient evaluation. Promising numerical results obtained with the method are presented. In fact we find that the number of iterations required is comparable to that of state-of-the-art quasi-Newton codes.


1.  Introduction 

This short paper is based on the report [4], and we shall frequently refer to this for details. Our main purpose is to present a new method for solving sparse minimax problems of the type

$$\min_x F(x) \quad \text{where} \quad F(x) = \max_i c_i(x). \qquad (1.1)$$

The functions $c_i : \mathbb{R}^n \to \mathbb{R}$ are smooth.

Minimax problems occur frequently in engineering and science, and they are often both large and sparse. Such problems arise in microwave circuit design, satellite antenna design, digital filtering and optimal truss design, to name a few applications. Another important use of a minimax method is to solve constrained nonlinear programming problems, and we return to this in section 3.

We are not aware of any methods specifically designed for large and sparse minimax problems, but several methods for dense problems exist in the literature. A list of many of these may be found in [4]. Some of the methods are first order methods that are based on linearizing the functions $c_i$, and the rest are second order methods, in the sense that they are based on the Hessian of a Lagrange function. Many of the methods approximate this Hessian using quasi-Newton updates, but this has the serious disadvantage that the commonly used updates quickly produce a full matrix, even for sparse problems. The approach is therefore prohibitively expensive for large sparse problems. An alternative is to use the exact Hessian or a finite difference approximation to it, and some of the methods do that. This may, however, also be too expensive for large problems. An additional disadvantage of the second order methods is that most of them rely on solving quadratic programming (QP) subproblems, and methods for sparse QP are not readily available.

These drawbacks are avoided by the first order methods. The use of a Hessian is avoided completely and no QP problems are solved. It is well known that for certain problems, methods based on linearization will be quadratically convergent. These problems have a strongly unique solution, which means that all directions from the solution are strictly uphill. Strongly unique solutions are necessarily such that the maximum in (1.1) is attained for at least n+1 of the functions $c_i$. The shortcoming of these basic first order methods is that if the solution sought is not strongly unique, then the convergence may be quite slow.

At any point x the functions $c_i$ that attain the maximum in (1.1) are called active, and the active set, S(x), is the corresponding set of indices i. In the neighbourhood of a non-degenerate solution x* the condition S(x) = S(x*) determines an (n+1-t)-dimensional differentiable manifold M, and this will be called the active manifold. An important fact is that the restriction of F to the active manifold is a smooth function.

The application of a basic first order descent method quickly gives a point on or near the active manifold, but convergence inside the manifold is slow, unless the problem is specially simple. There are two reasons for this slow convergence. Firstly, such a method has problems with keeping within the manifold, because a search direction found using linearization will be tangent to the manifold. The second reason is only relevant if the manifold is at least 2-dimensional. We then find that even if the manifold is linear the search direction need not be in the direction of the minimum on the manifold, and a steepest-descent-like zigzagging may occur. These reasons are discussed further in [4].

In [3] a first order method that avoids the first cause of slow convergence is given. That method is of the trust region type and solves the same LP subproblems as the method given in [5] for a basic step. If the basic step goes uphill and away from the active manifold, a special corrective step back toward the manifold is tried. The method uses one gradient evaluation per iteration, and on the test problems that we tried it used an average of 1.35 function evaluations per iteration.

In the present method we attempt to avoid both causes of slow convergence. The method is still purely first order and again uses only one gradient evaluation per iteration, but now three function evaluations are needed. It will, however, often be the case that gradient evaluations are significantly more expensive than function evaluations, so this should not be too much of a detriment. We have used conjugate gradients to avoid the steepest-descent-like zigzagging inside the manifold. In addition we use a simple linesearch along an active arc to obtain the next iterate.

2.  The  method 

In this section we give a rather informal description of the algorithm. A precise description is given in [4]. In what follows, x is the current iterate and $\rho$ is the current value of a trust region radius. The gradient of $c_i$ at x is a row vector denoted by $c_i'(x)$. The method uses sequential LP and each iteration begins with solving the following linear minimax subproblem

$$\min_{d_{LP}} \max_i \left(c_i(x) + c_i'(x)\, d_{LP}\right) \quad \text{s.t.} \quad \|d_{LP}\|_\infty \le \rho, \qquad (2.1)$$

which is equivalent to an LP problem. We need a current estimate of the active set S(x*) and we take this to be the active set of (2.1) at the solution and denote it by $\mathcal{A}$. On most iterations this is the only information that is used from (2.1), but occasionally it is advantageous to take the step $d_{LP}$. The current estimate of the active manifold will then be

$$M = \{\, y \mid c_i(y) = c_j(y),\ \forall i, j \in \mathcal{A} \,\}$$

(or, perhaps more precisely, some connected subset of this).

In trust region methods based on sequential QP, it is common that an iteration involves minimizing a quadratic model of the objective function inside the trust region. What we do is somewhat similar. On each iteration we minimize a model function, but our model function is only defined on a one-dimensional path from x. The path consists of two parts:

(1) A line segment from x to a point $x_p$ on or near M.

(2) An active arc $\gamma$ that approximately follows M from $x_p$ to the boundary of the trust region.

We first determine a projective step $p_1$ that goes approximately the shortest distance from x to M. From the requirement $x + p_1 \in M$ we get $c_i(x + p_1) = \text{constant}$, $i \in \mathcal{A}$. First order Taylor expansion gives

$$c_i(x) + c_i'(x)\, p_1 = \text{constant}, \quad i \in \mathcal{A} \qquad (2.2)$$

and $p_1$ is defined as the minimum norm solution to (2.2). The method is designed so that on most iterations x is close to M and $p_1$ is (therefore) short. From $x_p := x + p_1$ we find a direction s which is approximately tangent to the manifold M. On the first iteration, and on iterations when a restart is made, s is the steepest projected descent direction. On successive iterations s is found using conjugate projected gradients. The linear space $T = \{\, y \mid c_i'(x)\, y = c_j'(x)\, y\ \ \forall i, j \in \mathcal{A} \,\}$ is approximately the tangent space to M at x. Let P be a matrix that projects onto T. For $i \in \mathcal{A}$ the projected gradient of $c_i$ at x is $P c_i'(x)$. This turns out to be independent of i and we define the projected gradient to be $g = P c_i'(x)$, where $i \in \mathcal{A}$ is arbitrary. The direction s is -g when a restart is made. Otherwise,

$$s = -g + \beta P s_{last} \quad \text{where} \quad \beta = \frac{(g - g_{last})^T g}{g_{last}^T g_{last}} \qquad (2.3)$$

and $s_{last}$ and $g_{last}$ are the values of s and g on the previous iteration. Here we have adapted the Polak-Ribière formula. We restart with s = -g if one of the following occurs:

(1) the active set $\mathcal{A}$ has changed from the last iteration,

(2) the conjugate gradient direction turns out to be uphill, (2.4)

(3) no progress was made on the last iteration.
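A minimal sketch of the direction update (2.3) together with the restart tests (2.4), assuming the Polak-Ribière quotient $\beta = (g - g_{last})^T g / (g_{last}^T g_{last})$ and a projection matrix P stored as a list of rows. The function and argument names are hypothetical, and the uphill test `dot(s, g) >= 0` is one plausible reading of restart condition (2); the precise rules are in [4].

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matvec(P, v):
    """Product of a matrix (list of rows) with a vector."""
    return [dot(row, v) for row in P]


def next_direction(g, g_last, s_last, P, active_set_changed, made_progress):
    """Return the search direction s for the current iteration.

    g, g_last  -- projected gradients of this and the previous iteration
    s_last     -- previous search direction (None on the first iteration)
    P          -- matrix projecting onto the tangent-space estimate T
    """
    steepest = [-gi for gi in g]
    # Restart conditions (1) and (3), plus the very first iteration.
    if s_last is None or active_set_changed or not made_progress:
        return steepest
    diff = [a - b for a, b in zip(g, g_last)]
    beta = dot(diff, g) / dot(g_last, g_last)
    Ps = matvec(P, s_last)
    s = [-gi + beta * pi for gi, pi in zip(g, Ps)]
    # Restart condition (2): the conjugate direction points uphill.
    if dot(s, g) >= 0:
        return steepest
    return s
```

With P the 2x2 identity, g = [1, 0], g_last = [0, 2] and s_last = [0, -2], the quotient is β = 1/4 and the direction is [-1, -0.5].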

We define d to be a vector in the direction of s with length equal to some estimate of the optimal steplength. One may for instance use the length of the step taken on the previous iteration. From $x_p + d$ we find a new projective step $p_2$, toward M, in exactly the same way as $p_1$ was found. Details of the linear algebra needed to obtain $p_1$, s and $p_2$ are given in [3] and [4]. The active arc $\gamma$ is now defined by $\gamma(t) = t d + t^2 p_2$. We approximate F along $p_1$ with a linear function L and along $\gamma$ with a parabola Q so that $F(x + t p_1) \approx L(t)$, $F(x_p + \gamma(t)) \approx Q(t)$ and $L(1) = Q(0)$. Finally the next iterate is found by minimizing the approximation given by L and Q. If the minimum is not at x it is on the arc $\gamma$. We therefore let $t_{min}$ be the minimizer of Q and choose $x_p + \gamma(t_{min})$ as the new iterate, if the objective function is reduced sufficiently. If it is not possible to find a new iterate by searching along the path defined by $p_1$ and $\gamma$, the last attempt on this iteration is to try the point $x + d_{LP}$. If this also fails to give a sufficient reduction in F then x is unchanged on the next iteration.

The final task of the iteration is to update the trust region radius $\rho$. In common with other trust region methods this is based on the quotient

$$r = \frac{\text{actual reduction in } F}{\text{predicted reduction in } F} = \frac{F(x) - F(x_{new})}{L(0) - Q(t_{min})}. \qquad (2.5)$$

If r is close to 1 there is good agreement between the model and F, and we increase $\rho$. If r is small the agreement is bad, and we decrease $\rho$.

The modus operandi of an iteration is illustrated in fig. 2.1 for the case n=3, t=2. We now summarize the foregoing description in an algorithm. Several details are left out, and may be found in [4].

while not stop do
    solve the linear subproblem (2.1) for d_LP and find A and M
    find the projection p_1 using (2.2)
    find d using (2.3), (2.4) and the surrounding discussion
    find the projection p_2 using (2.2) with x replaced by x + p_1 + d
    minimize the approximation Q along γ and thereby determine t_min
    δ ← p_1 + γ(t_min)
    if F(x + δ) < F(x) then x ← x + δ
    else if F(x + d_LP) < F(x) then x ← x + d_LP
    update ρ based on r from (2.5)
end while

Fig. 2.1 The steps of an iteration
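The last step of the loop, updating ρ from the quotient r, might be sketched as follows. The thresholds 0.25 and 0.75 and the factors 0.5 and 2 are assumptions chosen purely for illustration. The text only states that ρ is increased when r is close to 1 and decreased when r is small, and the precise rule is among the details deferred to [4].

```python
def update_radius(rho, actual, predicted,
                  low=0.25, high=0.75, shrink=0.5, grow=2.0):
    """Return the new trust region radius.

    actual    -- F(x) - F(x_new), the reduction actually obtained
    predicted -- L(0) - Q(t_min), the reduction promised by the model
    The four keyword parameters are illustrative defaults, not values
    taken from the paper.
    """
    if predicted <= 0.0:
        # The model promised no decrease, so it cannot be trusted here.
        return rho * shrink
    r = actual / predicted
    if r > high:      # good agreement between model and F
        return rho * grow
    if r < low:       # poor agreement
        return rho * shrink
    return rho        # otherwise keep the radius
```

For instance, `update_radius(1.0, 0.9, 1.0)` doubles the radius, while `update_radius(1.0, 0.1, 1.0)` halves it.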
3. Numerical results

The  new  method  has  been  applied  to  some  standard  test  problems  and  the 
results  are  described  here.  For  comparison  some  other  methods  have  also 
been  tried.  So  far  we  have  not  applied  the  method  to  very  large  problems. 
The  reason  is  that  we  don't  have  a  sparse  code  yet,  and  furthermore  our 
philosophy  is,  that  if  the  method  should  be  able  to  solve  large  problems,  it 
is  necessary  that  standard  small  test  problems  are  solved  efficiently. 

As we said in the introduction, constrained optimization problems may be solved by a minimax algorithm. For $\sigma > 0$ small enough, the problem

$$\min f(x) \quad \text{s.t.} \quad c_i(x) \le 0, \quad i = 1, \dots, m-1 \qquad (3.1)$$

is equivalent to minimizing the exact penalty function $F(x) \equiv \max_{1 \le i \le m} (\sigma f(x) + c_i(x))$, where $c_m(x) \equiv 0$. We believe that this is an important and often overlooked way of solving constrained optimization problems.
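To make the reformulation concrete, the following sketch builds the penalty function F from an objective f and a list of constraint functions. The helper name `exact_penalty` and the one-dimensional example are assumptions for illustration. The extra function $c_m \equiv 0$ is represented by the constant 0.0 inside the maximum.

```python
def exact_penalty(f, constraints, sigma):
    """Return F(x) = max_i (sigma * f(x) + c_i(x)), with c_m(x) = 0 appended.

    f           -- objective function of the constrained problem
    constraints -- list of callables c_i, feasible when c_i(x) <= 0
    sigma       -- positive penalty scaling (must be small enough)
    """
    def F(x):
        # The leading 0.0 plays the role of the additional function c_m = 0,
        # so feasible points are charged sigma * f(x) only.
        worst = max([0.0] + [c(x) for c in constraints])
        return sigma * f(x) + worst
    return F


# Toy problem: minimize x subject to 1 - x <= 0, whose solution is x = 1.
F = exact_penalty(lambda x: x, [lambda x: 1.0 - x], sigma=0.5)
```

In this toy problem F(x) = 0.5 x + max(0, 1 - x), which decreases for x < 1 and increases for x > 1, so its minimizer coincides with the constrained solution.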
Table 3.1 The number of iterations to reach an accuracy of 10~^ in F

The test problems that we have used are described in [4]. The first two test problems are of the type (3.1) and the rest are of the type (1.1). The methods that we have used for comparison are listed below. More detailed references may be found in [4].

JÓNASSON & MADSEN  The method described in [3].

MADSEN '75  The method of Madsen [5].

HALD & MADSEN  The two phase method of [2].

WATCHDOG  An SQP method of M.J.D. Powell (the routine MINCF of [6]).

LANCELOT  The optimization package described in [1].

The results of the test runs are summarized in Table 3.1. Each entry in the table gives the number of iterations to reach an accuracy of 10“* in the objective function (see [4] for details of the stopping criterion).

The  test  results  show  clearly  that  the  new  method  is  quite  competitive 
with  the  more  complicated  second  order  methods,  at  least  on  this  (small) 
sample  of  test  problems.  The  second  order  methods  all  maintain  a  quasi- 
Newton  approximation  to  the  Hessian,  and  thus  both  storage  requirements 
and  work  per  iteration  will  be  higher.  We  also  see  that  the  new  method 
gives  considerable  improvement  over  the  other  first  order  methods. 
The purpose of this paper is to summarize the main model management aspects of stochastic linear programming and to outline some basic features of SLP-IOR, a model management system under development at the Institute for Operations Research of the University of Zurich. For model management systems in OR see Dolk [2], and for SLP-IOR see Kall and Mayer [4], [5], [6]. The discussion will be focused on single- and two-stage models, these being the model classes incorporated into the first version of SLP-IOR.

The models considered are as follows; for details see Kall [3].

Two-stage models

$$\begin{array}{ll} \min & \{\, c^T x + E_\xi Q(x, \xi) \,\} \\ \text{s.t.} & Ax \propto b \\ & x \in [l, u], \end{array} \qquad (1)$$

with $\xi \in \Xi \subset \mathbb{R}^k$, $(\Xi, \mathcal{F}, P)$ is a given probability space; $\propto$ means that any one of the relations $\ge$, $\le$ or $=$ is permitted, componentwise; and

$$Q(x, \xi) = \min \{\, q^T(\xi)\, y \mid W(\xi)\, y = h(\xi) - T(\xi)\, x,\ y \ge 0 \,\} \qquad (2)$$

where

$$q(\xi) = q^0 + \sum_{j=1}^k q^j \xi_j, \quad h(\xi) = h^0 + \sum_{j=1}^k h^j \xi_j, \quad T(\xi) = T^0 + \sum_{j=1}^k T^j \xi_j, \quad W(\xi) = W^0 + \sum_{j=1}^k W^j \xi_j. \qquad (3)$$

Problem (2) is called the second-stage (or recourse) problem; it can be interpreted as accounting for violations in the random equations $h(\xi) - T(\xi)\, x = 0$. The matrix $W(\xi)$ is called the recourse matrix of the model.

The set of equations (3) serves for modeling the way in which random variables influence the problem.

Chance-constrained (or probabilistic-constrained) models

a) Joint constraint

$$\begin{array}{ll} \min & E\, c^T(\xi)\, x \\ \text{s.t.} & P(\{\xi \mid T(\xi)\, x \ge h(\xi)\}) \ge \alpha \\ & Ax \propto b \\ & x \in [l, u] \end{array} \qquad (4)$$

b) Separate constraints

$$\begin{array}{ll} \min & E\, c^T(\xi)\, x \\ \text{s.t.} & P(\{\xi \mid t_i^T(\xi)\, x \ge h_i(\xi)\}) \ge \alpha_i, \quad \forall i \\ & Ax \propto b \\ & x \in [l, u] \end{array} \qquad (5)$$

In models (4) and (5) $\alpha$ and $\alpha_i$, $\forall i$, are the prescribed probability (reliability) levels for the fulfillment of the random inequalities.

From the model management point of view the following main constituents can be identified.

Underlying algebraic structure

Neglecting randomness in the models above (e.g. by replacing all random variables by their expected value) results in deterministic LP-models which are considered as the underlying algebraic structure in the linear case.

Underlying random-variable structure

This structure corresponds to the random variables $\xi_j$, $j = 1, \dots, k$, and is determined by the dependency-structure and probability distributions.

Underlying regression structure

The connection of the two previous structures is established via the linear affine relations (3). The terms in the affine sums determine the way in which randomness appears in the model.

The main model-management features of stochastic linear programming can be summarized as follows.

(1) Deterministic LP is a special case of stochastic linear programming (SLP). This implies that a model management system for SLP should include or at least facilitate the use of the powerful model management tools and systems available for LP.

(2) SLP-models are hard to solve numerically: Their solution involves numerically difficult NLP-problems (including multivariate integrals) or large scale LP or mixed-integer problems. The various solution approaches substantially depend on the type of model and on the random-variable and regression-structures. Selecting an appropriate solver is a key issue here: Solver performances for a given numerical model may differ by several orders of magnitude, including also virtually infinite solution time.

(3) Most of the solvers are designed for a deterministic equivalent of the original SLP-model. This means that developing a model-solver interface is not merely a question of data-format; a model-transformation is also involved. This transformation depends on the model-type and on the underlying structures as well as on the specific solver.

When considering a DSS containing also an SLP-component, features (2) and (3) imply that a reliable implementation would most likely involve a full-scale MMS component for handling the difficulties outlined above.

(4) Analysis features (needed also for selecting a solver) have to account for the algebraic structure but should also include special features for SLP, e.g. to determine whether a recourse matrix is of the complete-recourse type, or whether a covariance matrix is positive definite. For this kind of model manipulation operators see also Wallace and Wets [7].

(5) The system architecture should facilitate extensions into the direction of nonlinear or multi-stage stochastic programming models.

Main  features  of  SLP-IOR 

The basic idea is to build SLP-IOR around an existing algebraic modeling system; GAMS has been chosen for this purpose, see Brooke et al. [1]. This approach provides us with an excellent tool for handling the underlying algebraic structure and also facilitates extensions for nonlinear models. GAMS serves as a uniform interface facility for solvers and supports the important issue of model documentation. Powerful general-purpose solvers are also available with GAMS. They serve for comparing solver performance.

The model manipulation operators are implemented on three levels:

- The elementary entities in SLP-IOR are the various arrays and random variables appearing in the models. Typical manipulation operators are load, store, edit, show.

- The next level consists of the underlying algebraic-, random variable- and regression-structures. Besides the operators for elementary items some further operators like inject into a model, extract from a model, sample or discretize (for random variables) are also available.

- On the highest level are the models themselves. High level operators are e.g. solve, randomly generate, perturb, transform into another type, export/import.

Elementary items and the underlying structural elements serve as building blocks for models.

The model library contains a collection of test problems from the literature. We plan to endow SLP-IOR with a wide variety of solvers, for details see Kall and Mayer [4].

The user interactions are performed via an interactive menu-driven interface. A different approach is proposed by Gassmann and Ireland for scenario-based SLP models. They define an extension of an algebraic modeling language to account also for these SLP models.

To facilitate extension by new model-types or solvers the system is built in an object-oriented style. The model management activities, the models, the various array-types, the distributions as well as the solvers are implemented in class-hierarchies. Rules concerning the models, or the model-solver connection, are implemented as polymorphic Boolean functions.
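The idea of expressing model-solver rules as polymorphic Boolean functions can be illustrated with a small class hierarchy. This is only a sketch in Python (the system itself is written in Turbo Pascal), and every class name, attribute and rule below is hypothetical and does not reflect the actual SLP-IOR design.

```python
class Model:
    """Base class for SLP models; `discrete` flags a finite distribution."""

    def __init__(self, discrete):
        self.discrete = discrete


class TwoStageModel(Model):
    pass


class JointChanceModel(Model):
    pass


class Solver:
    def accepts(self, model):
        """Boolean rule; each solver class overrides it with its own test."""
        return False


class DeterministicEquivalentLP(Solver):
    def accepts(self, model):
        # Hypothetical rule: needs a two-stage model with finitely many
        # realizations so that an equivalent LP can be written down.
        return isinstance(model, TwoStageModel) and model.discrete


class NonlinearChanceSolver(Solver):
    def accepts(self, model):
        # Hypothetical rule: handles joint chance constraints only.
        return isinstance(model, JointChanceModel)


def admissible_solvers(model, solvers):
    """Names of all solvers whose rule evaluates to True for the model."""
    return [type(s).__name__ for s in solvers if s.accepts(model)]
```

Calling `admissible_solvers(TwoStageModel(discrete=True), [DeterministicEquivalentLP(), NonlinearChanceSolver()])` dispatches to each solver's own `accepts` rule, so adding a new solver class requires no change to the selection code.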

The present version of SLP-IOR is implemented for IBM/PC AT computers, endowed with an arithmetic coprocessor and having at least 8MB storage. Programming language is Turbo Pascal 6.0. To improve portability a C++ version will also be developed.
Extended Abstract :

Several problems such as blending problems, bundle pricing problems, logic design and circuits problems, etc. can be formulated as linear programs with logical constraints.

A logical constraint can be viewed as a constraint (or a set of constraints) that is connected with a kind of logical function that indicates for which solutions this constraint (or set of constraints) must be fulfilled and for which solutions not. Two basic logical constraints can be distinguished:

1. Either-Or constraints: all the constraints of q (with 1 ≤ q < p) out of p subsets of constraints must be satisfied, but it is not necessary to satisfy all constraints of all p subsets. If all constraints are linear, an Either-Or constraint can be formulated as:

$$\text{EITHER-OR}_{\,q^h \text{ out of } p^h} \left( \sum_{j=1}^n a_{ij}^{hl} x_j \le b_i^{hl},\ i = 1, \dots, m^{hl} \ \Big|\ l = 1, \dots, p^h \right)$$

2. Conditional constraints: a set of constraints need not be met by all feasible solutions; the constraints of this set must only be met by those feasible solutions that fulfil a specified condition. If all constraints involved in the condition and in the set are linear, a Conditional constraint can be formulated as:

$$\text{IF } \mathrm{MAX}\left(\mathrm{MIN}\left\{\left[\sum_{j=1}^n a_j x_j \le b\right]\right\}\right) = 1 \quad \text{THEN} \quad \sum_{j=1}^n a_{ij}^k x_j \le b_i^k, \quad i = 1, \dots, m^k$$

where

$$\left[\sum_{j=1}^n a_j x_j \le b\right] = \begin{cases} 1 & \text{if } \sum_{j=1}^n a_j x_j \le b, \\ 0 & \text{otherwise.} \end{cases}$$
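The two basic logical constraints can be checked for a given point x with a few lines of code. The sketch below assumes that each linear inequality is stored as a pair (a, b) meaning a·x ≤ b, that an Either-Or constraint requires at least q of its subsets to hold completely, and that the condition of a Conditional constraint is a MAX over groups of a MIN over inequality indicators, i.e. it holds if all inequalities of some group hold. The function names and data layout are illustrative only.

```python
def satisfies(x, rows, tol=1e-9):
    """True if every inequality a.x <= b in `rows` holds at x."""
    return all(sum(ai * xi for ai, xi in zip(a, x)) <= b + tol
               for a, b in rows)


def either_or(x, subsets, q):
    """Either-Or: at least q of the given subsets are satisfied completely."""
    return sum(1 for rows in subsets if satisfies(x, rows)) >= q


def conditional(x, condition_groups, consequent):
    """Conditional: if some group of the condition holds, the consequent must.

    The condition is the MAX over groups of the MIN over the 0/1 indicators
    of the inequalities in the group.
    """
    condition = any(satisfies(x, group) for group in condition_groups)
    return (not condition) or satisfies(x, consequent)
```

For x = [1, 1] and the three single-inequality subsets x_1 ≤ 0.5, x_2 ≤ 2 and x_1 + x_2 ≤ 3, exactly two subsets hold, so a "2 out of 3" Either-Or constraint is satisfied while a "3 out of 3" one is not.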

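A linear programming model with m constraints, r Either-Or constraints and o Conditional constraints (each Conditional constraint contains $m^k$ constraints and $r^k$ Either-Or constraints) can be formulated as:

$$\begin{array}{l}
\text{Min } \sum_{j=1}^n c_j x_j \\[4pt]
\sum_{j=1}^n a_{ij} x_j \le b_i, \quad i = 1, \dots, m \\[4pt]
\text{EITHER-OR}_{\,q^h \text{ out of } p^h} \left( \sum_{j=1}^n a_{ij}^{hl} x_j \le b_i^{hl},\ i = 1, \dots, m^{hl} \ \Big|\ l = 1, \dots, p^h \right), \quad h = 1, \dots, r \\[4pt]
\text{IF } \mathrm{MAX}\left(\mathrm{MIN}\left\{\left[\sum_{j=1}^n a_j x_j \le b\right]\right\}\right) = 1 \\[4pt]
\text{THEN } \sum_{j=1}^n a_{ij}^k x_j \le b_i^k, \quad i = 1, \dots, m^k \\[4pt]
\qquad \text{EITHER-OR}_{\,q^{kh} \text{ out of } p^{kh}} \left( \sum_{j=1}^n a_{ij}^{khl} x_j \le b_i^{khl},\ i = 1, \dots, m^{khl} \ \Big|\ l = 1, \dots, p^{kh} \right), \quad h = 1, \dots, r^k \\[4pt]
\qquad k = 1, \dots, o \\[4pt]
x_j \ge 0, \quad j = 1, \dots, n
\end{array} \qquad (1)$$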
Each instance of the above model for linear programming with logical constraints can be reformulated as a mixed integer program by using certain basic modelling tools. This reformulation proves the solvability of the proposed model and provides a first algorithm.

A  new  algorithm  for  solving  directly  the  Instances  of  the 
model  is  constructed  based  on  following  observations  : 

1.  The  region  of  feasible  solutions  consists  of  several 
separated  (or  adjoining)  convex  subregions. 

2.  The  region  of  feasible  solutions  of  the  relaxed 
problem  -this  is  the  linear  program  without  the 
logical  constraints-  is  a  closed  convex  cover  of  the 
subregions  mentioned  in  observation  1  if  at  least 
one  of  these  subregions  is  not  empty;  otherwise  it 
is  empty. 

3. At least one of the optimal solutions, if there are
solutions, is among the basic solutions of the
convex subregions. In other words, at least one of
the optimal solutions is among the basic solutions
formed by all the linear inequalities in (1).

4. The SIMPLEX algorithm moves from one basic solution
to a neighbouring basic solution.
The new algorithm follows the concept of the cutting plane
algorithms, with the objective function used to obtain the
cut-off constraint. The algorithm can be described by the
following steps:

Step 0 : Initial solution.

Take as initial basic solution(s) X* and as initial value of
the objective function z* the optimal basic solution(s) and the
corresponding value of the objective function of the relaxed
problem:

$$
\min \sum_{j=1}^{n} c_j x_j \quad\text{s.t.}\quad
\sum_{j=1}^{n} a_{ij} x_j \le b_i,\ i = 1,\dots,m; \qquad
x_j \ge 0,\ j = 1,\dots,n.
$$

If  there  is  a  solution  for  the  relaxed  problem  then  go  to  step 
1;  else  stop. 

Step  1  :  Check  feasibility. 

Check  if  one  of  the  X*  is  feasible;  in  other  words  check  if  one 
of  the  X*  fulfils  all  the  logical  constraints. 

If so, then X* is the optimal solution and the algorithm stops.
If not, go to step 2.

Step 2 : Reduce convex cover.

Determine for each basic solution X* the value of the objective
function in its neighbouring basic solutions that lie within the
feasible region of the relaxed problem. (Those basic solutions
are formed by all the linear inequalities of the LP-CCEO.)
Select as new z* the value with the smallest strictly positive
increase.

Add the following constraint to the relaxed problem:

$$
\sum_{j=1}^{n} c_j x_j \ge z^{*}
$$

Reoptimize and take as solution(s) X* the new optimal solution(s),
if they exist, and go to step 1; if there is no solution, then
stop.
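The level-by-level idea of Steps 0 to 2 can be illustrated with a brute-force sketch in two dimensions. This is not the simplex-based procedure of the abstract: instead of pivoting to neighbouring basic solutions, the sketch enumerates every basic solution formed by all linear inequalities, keeps those inside the relaxed feasible region, and visits their objective values in increasing order until one satisfies the logical constraints. The example data and all names are hypothetical.

```python
from itertools import combinations

TOL = 1e-9


def dot(a, x):
    return a[0] * x[0] + a[1] * x[1]


def intersect(c1, c2):
    """Intersection of the boundary lines of two constraints (a, b), or None."""
    (a1, b1), (a2, b2) = c1, c2
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) < TOL:
        return None  # parallel boundaries define no basic solution
    return ((b1 * a2[1] - a1[1] * b2) / det, (a1[0] * b2 - b1 * a2[0]) / det)


def level_search(c, relaxed, logical_lines, logical_ok):
    """Minimize c.x over the relaxed region subject to logical_ok(x)."""
    nonneg = [((-1.0, 0.0), 0.0), ((0.0, -1.0), 0.0)]  # x1 >= 0, x2 >= 0
    base = relaxed + nonneg
    candidates = []
    for c1, c2 in combinations(base + logical_lines, 2):
        p = intersect(c1, c2)
        if p is not None and all(dot(a, p) <= b + TOL for a, b in base):
            candidates.append(p)
    # Each objective level plays the role of z*; moving to the next level
    # corresponds to adding the cut c.x >= z* and reoptimizing.
    for z in sorted({round(dot(c, p), 9) for p in candidates}):
        for p in candidates:
            if abs(dot(c, p) - z) < 1e-7 and logical_ok(p):
                return p, z
    return None  # no basic solution satisfies the logical constraints


# Hypothetical instance: maximize x1 + x2 (minimize -x1 - x2) on the box
# x1 <= 4, x2 <= 4, with EITHER x1 + 2*x2 <= 4 OR 2*x1 + x2 <= 4.
relaxed = [((1.0, 0.0), 4.0), ((0.0, 1.0), 4.0)]
either_or = [((1.0, 2.0), 4.0), ((2.0, 1.0), 4.0)]


def at_least_one(p):
    return any(dot(a, p) <= b + TOL for a, b in either_or)


result = level_search((-1.0, -1.0), relaxed, either_or, at_least_one)
```

In this instance the relaxed optimum (4, 4) violates both alternatives, so the search moves up to the next objective level, where a basic solution satisfying one of the alternatives is found.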


A first comparison of the two solution methods, on the
criteria "Required memory-space" and "Observed computational
time", will be made.
Extended Abstract

Consider a firm that owns a stock of capital goods $K$ with which it can produce
$Q(K)$, where $Q(0) = 0$, $Q' > 0$, $Q'' < 0$. $Q$ can be sold on the market
at a fixed price $p$. It is assumed that pollution within this production process is
hardly controllable. Rather, it is subject to uncertainties of various sorts which cannot
easily be controlled technologically or would require extremely large investments. This
occurs for example in situations where there is a small probability of polluting.

We assume that pollution occurs following a compound Poisson process and approximate
this process by a diffusion (Wiener) process. Namely, let $\lambda$ be the rate at which pollution
occurs. Thus, the probability of pollution occurring in a time interval $dt$ is $\lambda\,dt$. Given
that pollution occurs, let $F(\cdot)$ be the density of the pollution size. It seems reasonable to
assume that the mean and variance of this size increase with production and therefore we
suppose that the pollution size has mean $aQ(K)$ ($a > 0$ and constant), where $a$ is the
expected emissions-output ratio (cf. Dasgupta (1982), p. 20), and variance $\sigma^2 a^2 Q^2(K)$
($\sigma > 0$ and constant). A mean-variance diffusion normal approximation to this process
is well known (see Tapiero (1984), Tapiero and Zuckermann (1982)) and is given by
$\lambda a Q(K)\,dt$ and $\lambda \sigma^2 a^2 Q^2(K)\,dt$.

Therefore,  if  we  normalize  the  cost  per  unit  of  pollution  to  one  and  dz  is  the  (stochastic) 
pollution  damage  in  dollars  in  the  time  interval  dt,  its  diffusion  approximation  is 


$$
dz = \lambda a Q(K)\,dt + \sqrt{\lambda}\,\sigma a Q(K)\,db, \tag{1}
$$

where $b$ is a standard Wiener process.
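A minimal simulation sketch of the damage process (1), using the Euler-Maruyama scheme. The production function $Q(K) = \sqrt{K}$ and the function names are assumptions chosen only for illustration; the abstract only requires $Q(0) = 0$, $Q' > 0$, $Q'' < 0$.

```python
import math
import random


def production(k: float) -> float:
    """Hypothetical concave production function with Q(0) = 0."""
    return math.sqrt(k)


def simulate_damage(k, lam, a, sigma, horizon, steps, rng):
    """Accumulate dz = lam*a*Q dt + sqrt(lam)*sigma*a*Q db over [0, horizon]."""
    dt = horizon / steps
    q = production(k)
    z = 0.0
    for _ in range(steps):
        db = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment, N(0, dt)
        z += lam * a * q * dt + math.sqrt(lam) * sigma * a * q * db
    return z


# One sample path with illustrative parameter values.
sample = simulate_damage(k=4.0, lam=0.5, a=0.2, sigma=0.3,
                         horizon=10.0, steps=1000, rng=random.Random(0))
```

With `sigma = 0` the random term vanishes and the accumulated damage reduces to the deterministic drift $\lambda a Q(K)$ times the horizon, which gives a simple sanity check of the scheme.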

The firm can partly insure itself against the uncertainty of the emission cost, but a
drawback is that the premium rate includes a loading factor, which insurance firms
often use to account for the dollar margin paid for risk protection (see Tapiero
(1985)). If the insured part is denoted by $\theta$ and $\delta$ stands for the loading factor, then
the premium rate equals

$$
(1 + \delta)\,\theta \lambda a Q(K) \tag{2}
$$

and the emission cost arising from the part that is not insured is given by

$$
(1 - \theta)\lambda a Q(K)\,dt + \sqrt{\lambda}\,(1 - \theta)\sigma a Q(K)\,db. \tag{3}
$$


In the model we suppose that $\theta$ is exogenously determined and we will try to establish
how the solution is influenced when $\theta$ increases or decreases.

Capital  stock  is  of  the  non-depreciating  type  and  can  be  increased  by  investment: 

dK  =  Idt  (4) 


In  our  quest  to  obtain  analytical  results  we  leave  financing  possibilities  like  borrowing 
and  issuing  new  shares  aside.  If  we  let  the  cash  process  of  the  firm  include  all  transactions 
(such  as  dividend  distribution,  investments,  returns,  insurance  premiums,  emission  costs) 
we  obtain  the  following  state  equation  for  the  cash  balance: 

$$
dM = \{pQ(K) - (1 - \theta)\lambda a Q(K) - (1 + \delta)\theta\lambda a Q(K) - I - D\}\,dt
     + \sqrt{\lambda}\,\sigma a (1 - \theta) Q(K)\,db. \tag{5}
$$


The  firm  behaves  as  if  it  maximizes  the  shareholders’  value  of  the  firm  which  can  be 
expressed  as  the  mathematical  expectation  of  the  discounted  dividend  stream  over  the 
planning  period.  Hence,  the  objective  function  becomes: 


$$
\text{maximize}\quad E_0\left[\int_0^T D \exp(-it)\,dt\right] \tag{6}
$$
The horizon date $T$ is determined such that it equals the bankruptcy time. We assume that
the firm is bankrupt as soon as the cash balance becomes negative, which is expressed
in the following equation for the horizon date:

$$
T = \inf\{t \mid M(t) < 0\} \tag{7}
$$

The assumption of irreversibility of investment and the nonnegativity of the dividend
rate are captured by the following inequalities:

$$
D \ge 0 \tag{8}
$$

$$
I \ge 0 \tag{9}
$$

It is assumed that the firm does not spend more on investment and dividend than the
revenue net of expected pollution expenses:

$$
D + I \le pQ(K) - (1 - \theta)\lambda a Q(K) - (1 + \delta)\theta a \lambda Q(K) \tag{10}
$$


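To summarize: the model contains two state variables $K$ and $M$, two control variables
$I$ and $D$, and can be expressed as follows:

$$
\text{maximize}\quad E_0\left[\int_0^T D \exp(-it)\,dt\right] \tag{11}
$$

s.t.

$$
dK = I\,dt, \qquad K(0) = K_0 \tag{12}
$$

$$
dM = [\{p - (1 - \theta)a\lambda - (1 + \delta)\lambda\theta a\}Q(K) - I - D]\,dt
     + \sqrt{\lambda}\,\sigma a (1 - \theta) Q(K)\,db, \qquad M(0) = M_0 \tag{13}
$$

$$
D \ge 0 \tag{14}
$$

$$
I \ge 0 \tag{15}
$$

$$
D + I \le \{p - (1 - \theta)\lambda a - (1 + \delta)\lambda\theta a\}Q(K) \tag{16}
$$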


For this model to be well defined it is necessary that the right-hand side of (16) is positive.
In order to make sure that this is the case for all $\theta \in [0, 1]$ we introduce the following
additional assumption:

$$
p - \lambda a (1 + \delta) > 0 \tag{17}
$$

(17) implies that the revenue from selling one product exceeds the premium per product
to be paid when the firm is fully insured against the uncertainty of the emission cost.
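A short check of why (17) is enough, using only the coefficient that multiplies $Q(K)$ on the right-hand side of (16) and assuming $\delta \ge 0$ (the abstract does not state the sign of the loading factor explicitly):

$$
p - (1 - \theta)\lambda a - (1 + \delta)\theta\lambda a
  = p - \lambda a (1 + \delta\theta)
  \ \ge\ p - \lambda a (1 + \delta) \ >\ 0
  \qquad \text{for all } \theta \in [0, 1].
$$

The coefficient is decreasing in $\theta$, so its smallest value is attained at full insurance, $\theta = 1$, which is exactly the left-hand side of (17).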

In  order  to  facilitate  the  notation  we  introduce  the  following  symbols: 


$$
u = p - (1 - \theta)a\lambda - (1 + \delta)\theta a \lambda \tag{18}
$$

$$
\hat{\sigma} = \sigma a (1 - \theta)\sqrt{\lambda} \tag{19}
$$

If $u = 1$ and $\hat{\sigma} = \sigma$ our model reduces to the original Bensoussan and Lesourne model
with irreversible investment (see Bensoussan and Lesourne (1980)). As in that model,
we have three candidate policies for optimality:

Investment policy: $I = uQ(K)$, $D = 0$;

Cash policy: $I = D = 0$;

Dividend policy: $I = 0$, $D = uQ(K)$.
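The symbols of (18) and (19) and the three candidate policies can be written down as a small Python sketch. The function names and the numerical values are hypothetical; the sketch only evaluates the controls and the drift of the cash balance for a given policy and does not decide which policy is optimal.

```python
import math


def net_margin(p, lam, a, theta, delta):
    """u from (18): price minus expected uninsured cost and premium per unit."""
    return p - (1 - theta) * a * lam - (1 + delta) * theta * a * lam


def sigma_hat(sigma, a, theta, lam):
    """Diffusion coefficient from (19)."""
    return sigma * a * (1 - theta) * math.sqrt(lam)


def controls(policy, u, q):
    """Return (I, D) for one of the three candidate policies at output q = Q(K)."""
    if policy == "investment":
        return (u * q, 0.0)
    if policy == "cash":
        return (0.0, 0.0)
    if policy == "dividend":
        return (0.0, u * q)
    raise ValueError(f"unknown policy: {policy}")


def cash_drift(policy, u, q):
    """Drift of the cash balance M: u*Q(K) - I - D."""
    investment, dividend = controls(policy, u, q)
    return u * q - investment - dividend


# Illustrative values only.
u = net_margin(p=1.0, lam=0.5, a=0.4, theta=0.5, delta=0.2)
```

Under the investment and dividend policies the whole net revenue is spent, so the cash balance has zero drift and moves only through the stochastic term; under the cash policy the drift equals $uQ(K)$.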

Given the parameter values, it depends completely on the values of the state variables
$M$ and $K$ which of the three policies is optimal for the firm to carry out (because
the horizon time is state dependent, the optimal solution does not depend on time).
Therefore we divide the $M$-$K$ plane into three regions: investment-region ($I$), cash-region
($M$) and dividend-region ($D$). From the analysis of Bensoussan and Lesourne (1980) (see
also Van Hilten, Kort and Van Loon (1992), Chapter 11) we obtain the most realistic
solution, which is depicted in Figure 1.
Figure 1. The most realistic solution.

The solution of Figure 1 is only optimal when a certain condition on the parameter
values is met. In the paper we establish in what way this condition and the configuration
of this solution are influenced when we change the values of the insurance part $\theta$ and the
pollution occurrence probability $\lambda$.
Extended abstract

1.  INTRODUCTION 

In the last fifty years mathematical programming has been playing a vital role in studying the
behaviour of, and solving problems related to, complex systems consisting of numerous interrelated
objects and activities. New disciplines like operations research and management science, as well as
new branches of established ones, have emerged to cope with the rather difficult development of
general and field-specific adaptations of models, methods and algorithms.

By the end of the 1970s, and more characteristically during the 1980s, mathematical programming
based modelling and problem solving had to face new challenges. This was due to increasing demand
raised by computerization, automation and industrial development. Advances in computers and
computer science provided better hardware and software tools on the one hand, but created concurrent
disciplines, paradigms and system architectures for supporting human problem solving and decisions
on the other: object oriented, functional and logic programming; artificial intelligence; decision
support systems, deductive databases, knowledge based and expert systems, to name but a few.

Many of the mathematical programming models and algorithms are very powerful and efficient
indeed, but a retrospective inspection might reveal apparent difficulties in their present applications,
some of which have been observed and discussed by practitioners for many years. Two of them should
serve as an illustration here. Perhaps the most severe symptom is the long development time of the
basic model and a reasonably efficient algorithm variant, even if standard elements are built into it. All
substantial modifications and extensions require a rather long time. Another inherent difficulty is the
handling of approximate or missing data and of an incomplete model, in the sense that certain aspects,
properties, relations and types of constraints are not yet known, but are realized later during problem solving.

The present paper covers alternative approaches to modelling and problem solving by
mathematical programming and by logic programming. The logic programming paradigm is briefly
reviewed together with the principles of nondeterminism and unification. Problem solving is considered
as a knowledge based search in a search tree which unfolds through the problem solving process
to the extent needed to determine all the solutions. The conceptual discussion of differences and
similarities in the process of development in the two disciplines will be illustrated by parallel
examples. Advantages and difficulties of both approaches are discussed. An attempt is made to
combine the two approaches in several different ways.
2. TERMINOLOGY

The main subject of the present paper is the support of human problem solving and decisions by
computer models and automatic or interactive problem solving systems. The central focus of attention
is the comparison of methodologies developed around the paradigms of mathematical programming and
logic programming respectively. In order to get compact discussions and clear views a terminology is
introduced reflecting different roles and elements that are important in the process of building up
such support systems. For example the word ‘problem’ has three basically different meanings in our
context. It can be the initial problem of a human decision maker, one of the mathematical problems
within the computer model, or a difficulty arising at any stage. These will be labelled ‘decision
problem’, ‘problem’ and ‘difficulty’. In the rest of the paper the following terminology will be used:

Expert is the human decision maker who has knowledge and expertise in his field and is capable of
producing answers to questions and solutions to problems arising within his area, and of making high
quality decisions of different kinds. He is the one whose decisions are to be supported the way he needs it.
Naturally, the word ‘expert’ might stand for a male or female person, a group of persons or perhaps
even an entire organization. In different fields or contexts he might be called decision maker, manager,
leader, engineer, designer, scientist etc.

Decision problem is a problem the expert is facing and wants to solve according to his knowledge
and the expectations of his closer and wider community. His result or solution to the decision
problem will be called decision, even if it is a plan, an estimate or a diagnosis. Accordingly, all
problem solving activities of the expert are called decision process.

Supporter is the person or group who carries on discussions with the expert, makes a model
reflecting the real world around the decision problems, which is captured to a certain extent
by the problem described within the model, and builds up a problem solver, or solver for short, capable
of producing a solution to the problem. The process of finding the solution(s) is the problem
solving process, regardless of whether it is automatic or interactive. The solver together with a computer
environment for communication, interaction, input/output, report writing, etc. will be called support system.

When mathematical programming is used for decision support it is combined with a certain
environment, often the models and methodology of operations research or management science.
Such modelling environments for logic programming are often knowledge systems, expert systems
or artificial intelligence. These categories differ from each other in focus as well as in methods and
techniques, but are closely related. In the further discussion, operations research and knowledge
systems are chosen respectively as typical environments.

3.  PROBLEM  SOLVING  VIA  MATHEMATICAL  PROGRAMMING 
3.1  What  is  the  problem? 

In  the  first  phase  of  modelling  it  is  vital  to  understand  what  kind  of  decision  problems  the  expert 
is  facing  and  most  importantly,  what  are  the  actual  difficulties  of  the  expert,  influencing  the  quality  of 
the  decisions  he  is  coming  up  with  and  the  resources  required  through  the  decision  process. 

The supporter with a mathematical programming/operations research background is looking for
decision problems or subproblems that can be formulated as optimization problems, perhaps through
an operations research model, preferably with a good efficient algorithm. The variables, the
constraints and an objective function are identified, the latter by selecting the most important measure
of quality of the decision. Other alternative measures are kept for secondary optimization or as
constraints with a parametric level of satisfaction. If there are measures almost equally important, then
usually the weighted sum of objectives is used as an objective function. The expert is supposed to
give suggestions for the weights. The expert is to be convinced that enormous perspectives are
opened into the structural examination of the problem he never would have hoped for.
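The weighted sum of objectives mentioned above can be sketched as follows. The candidate decisions, the measure names and the weights are invented for illustration; all measures are assumed to be costs, so smaller scores are better.

```python
from typing import Dict


def weighted_score(measures: Dict[str, float], weights: Dict[str, float]) -> float:
    """Collapse several quality measures into one objective value."""
    return sum(weights[name] * value for name, value in measures.items())


def best_decision(candidates: Dict[str, Dict[str, float]],
                  weights: Dict[str, float]) -> str:
    """Return the name of the candidate with the smallest weighted score."""
    return min(candidates, key=lambda name: weighted_score(candidates[name], weights))


# Hypothetical candidate decisions evaluated on two measures.
candidates = {
    "plan_a": {"cost": 10.0, "delay": 4.0},
    "plan_b": {"cost": 12.0, "delay": 1.0},
}

equal_weights = {"cost": 1.0, "delay": 1.0}
cost_focused = {"cost": 1.0, "delay": 0.1}
```

With equal weights `plan_b` scores 13 against 14 for `plan_a`; once the delay weight drops to 0.1 the ranking flips. This sensitivity is why the expert's suggested weights matter so much in this style of modelling.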

3.2 Modification of the model and the algorithm

The requirements, constraints, conditions and variables of the decision problem(s) that are
identifiable within the model framework, but cannot be handled by the existing efficient algorithms,
are reconsidered for inclusion in a modified optimization model. Linear approximation, aggregation,
decomposition, simulation and relaxation are examples of the techniques used. In fact, these are attempts to
get a reasonably efficient algorithm for the extended problem, but in this phase algorithmic and model
changes are almost inseparable.

3.3 Using real data

In the next phase the modified and usually extended algorithm should be tested against real data,
using combinations of possibilities built into the support system in the course of basic development.
Very often the standard source of data (e.g. a regular database) is insufficient and special data collection
is needed. Perhaps the expert could not do data manipulations as extensive as those many
support systems do easily. (‘The human is not a machine.’) But the opposite can also be true. Some
human mental activities might be difficult for the machine to follow, thus a large amount of data and
very complicated algorithms may be necessary to cope with tasks easy for humans. (‘The machine is
not a human being.’) Additionally, the algorithm may behave very differently on almost random data
than on real data. Field- or problem-specific modifications of the algorithm may become advisable or
even necessary to get an acceptable performance.

3.4  Testing  the  model  and  algorithm  for  real  decision  problems 

There is almost never enough time for off-line testing of the support system on real decision
problems, i.e. simulating the entire decision environment and full decision process for the purpose of
testing the system. Usually only a few quite artificial or at least isolated test problems are created and
used over and over again. Many of the requirements and demands become obvious only when the
system is in regular use, leading to serious difficulties and even to rejection of the system, if it
reaches this stage at all.

3.5 Forming and standardizing the support system

In  the  final  phase  the  standardization  of  the  input  and  output  of  the  models  and  algorithms  is  done. 
The  user  guide  describes  how  to  use  the  system,  including  the  parameters,  their  standard  settings, 
other  possible  values  and  their  modifying  effects  on  the  algorithms  (step  sizes,  precisions, 
frequencies  of  numeric  corrections,  selection  of  heuristics  and  other  strategies,  etc.) 
4. REFLECTIONS ON THE MATHEMATICAL PROGRAMMING APPROACH

Advantages of the optimization/operations research approach

• It is supported by a vast amount of theoretical research results.

• More than forty years of experience in modelling and problem solving, including
implementations of algorithms and code optimization.

• Many professional operations research groups, management science and other groups develop
and reuse models for a large variety of applications.

• There are many well understood and efficient algorithms for standard optimization problems,
available also in software packages.

Difficulties  with  the  optimization  based  decision  process  support 

In spite of all these facts, several symptoms indicate serious difficulties with introducing
optimization based support systems into real practice when the application field is in a
dynamically changing environment, when the types of decision problems cannot be described in advance,
or when some of the data required by the models is not available. These symptoms include unreasonably
long modelling and systems development time and strong difficulties with modifying the model or the
view of the problem solving. There are often complaints about the flexibility of the decision support the
system can provide, considering starting points, priorities, special conditions, the amount and kind of data
requested by the support system, or the variety of solutions obtained at the end of a session.
Interactive decision support and problem solving seems to be particularly difficult. The decision
problems have to be specified in terms of the mathematical problems of the support system.
Interfacing the two different ways of thinking is difficult and rigid.

5.  THE  THREE  PILLARS  OF  LOGIC  PROGRAMMING 

5.1  What  is  logic  programming? 

The basic idea of logic programming is to describe properties and relations of the objects and
concepts of the area of our present interest (‘the universe of discourse’) in terms of predicate logic,
and to get answers to our questions and solutions to our problems with the help of a logic reasoning
system. The questions or problems posed to the logic program are called goals. In other words the
reasoning system, called inference engine, is supposed to find the condition under which our goal is a
consequence of the logic program. This condition is then a solution of our goal (problem). If there are
several solutions, the inference engine is supposed to find all of them. The strength of logic
programming is provided by the three main components called unification, non-determinism and
inference engine, which are informally summarized below.

5.2  Unification 

Suppose a goal or subgoal G is given which is to be solved. From a high level viewpoint,
unification has three main functions. Firstly, to find one or more knowledge descriptions K that are
applicable for solving G. Secondly, if the knowledge K is too general for goal G, then it should be
specialized appropriately. Thirdly, if the knowledge K is not sufficiently general to handle/solve the goal G,
then the goal should be appropriately specialized. Usually specialization of knowledge K and goal G
is done simultaneously by the unification process. It should be noted that specializing a subproblem
by a condition implies that the rest of the problem must be specialized the same way. If the
application of knowledge K to solve goal G is not successful or alternative solutions are wanted, then
other applicable knowledge descriptions are used, according to the paradigm of non-determinism.
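The simultaneous specialization of knowledge K and goal G can be illustrated with a minimal first-order unification sketch. The term representation is an assumption made for this example: variables are strings starting with an uppercase letter, constants are other strings, and compound terms are tuples whose first element is the functor. The sketch omits the occurs check for brevity.

```python
def is_var(term):
    """Variables are strings that start with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def walk(term, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(x, y, subst=None):
    """Return a substitution making x and y equal, or None if there is none."""
    subst = {} if subst is None else subst
    x, y = walk(x, subst), walk(y, subst)
    if x == y:
        return subst
    if is_var(x):
        return {**subst, x: y}
    if is_var(y):
        return {**subst, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            subst = unify(xi, yi, subst)
            if subst is None:
                return None
        return subst
    return None  # clash between different constants or structures


# Goal G and a piece of knowledge K (hypothetical facts about parenthood).
goal = ("parent", "X", "bob")
knowledge = ("parent", "alice", "Y")
bindings = unify(goal, knowledge)
```

Here the goal is specialized by binding `X` and the knowledge by binding `Y` in the same pass; a result of `None` signals that this piece of knowledge is not applicable and another one has to be tried.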

5.3 Non-determinism

Informally, non-determinism means that alternative definitions, properties and relations of the
objects as well as alternative methods, heuristics, strategies and tactics may be used in the logic
program. As long as at least one combination of these alternatives can positively support the
reasoning system (inference engine), finding all solutions of the stated goal is guaranteed. In the
opposite case the inference engine should be able to prove that, on the basis of the knowledge
provided by the logic program, there is no solution to the goal.

5.4  Inference  engine 

The  main  task  of  the  inference  engine  is  to  show  whether  a  stated  goal  is  a  consequence  of  a  given 
logic  program,  representing  the  present  knowledge  about  the  field  under  consideration.  It  is  an 
essential  part  of  the  task  to  find  gradually  the  conditions  under  which  the  goal  is  provable  from  the 
logic  program,  because  this  set  of  conditions  is  considered  the  solution  of  the  stated  goal/problem.  All 
alternative  solutions  should  be  determined. 
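How alternatives are tried one after another until all solutions are found can be sketched with a small backtracking generator. This is a simplified stand-in for an inference engine, not a resolution prover: the "knowledge" is reduced to finite domains plus a consistency test, and all names are illustrative.

```python
def solutions(variables, domains, consistent, partial=None):
    """Yield every complete assignment accepted by `consistent`."""
    partial = {} if partial is None else partial
    if len(partial) == len(variables):
        yield dict(partial)
        return
    var = variables[len(partial)]
    for value in domains[var]:      # choice point: alternatives tried in turn
        partial[var] = value
        if consistent(partial):
            yield from solutions(variables, domains, consistent, partial)
        del partial[var]            # backtrack and try the next alternative


def ordered(assignment):
    """Hypothetical condition: X must be smaller than Y once both are known."""
    if "X" in assignment and "Y" in assignment:
        return assignment["X"] < assignment["Y"]
    return True


domains = {"X": [1, 2, 3], "Y": [1, 2, 3]}
all_solutions = list(solutions(["X", "Y"], domains, ordered))
```

An empty result list corresponds to the case where the engine has exhausted every alternative and thereby shown that the goal has no solution under the given knowledge.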

5.5 Prolog as a logic programming based computer language

In sections 5.2-5.4 some of the basic principles of logic programming were indicated. The Prolog
language inherits the declarative semantics of logic programming. Prolog has a procedural semantics
as well; this property, together with the inference engine, makes it executable on a computer. This does
not mean, however, that the Prolog language is procedural, though, being a general purpose computer
language, one can and occasionally does write procedures in Prolog. A more detailed explanation is
beyond the scope of the present paper, but one can say that the vast majority of a good Prolog
program is declarative in nature. This means that it is not a series of instructions to the computer, but
a set of definitions of objects, their properties and relationships. It is ‘not doing anything’ until the
goal/subgoal is stated. After this, the logic is interpreted in such a way that the goal is proven/solved
by the inference engine. That activity might require very different kinds of reasoning, depending on
what the goal is. In short, from the point of view of its run-time behaviour, the logic program
can be considered as a condensed description of many potential algorithms, not necessarily known
before the goal/problem is stated. Even the same goal/subgoal might be reasoned about in different
ways, often resulting in several conceptually different solutions.

6.  PROBLEM  SOLVING  VIA  LOGIC  PROGRAMMING 

The logic programming methodology makes it natural to think in terms of a field of interest, initial
problems with gradually discovered additional properties and relations of its objects, and finally, the
knowledge needed to interpret the kind of problems we want to understand and solve, rather than
defining a particular mathematical problem and an algorithm to solve it. Incomplete problem
descriptions and incomplete information are very common in (real life) decision processes. The main
point is to advance the problem solving and the problem definition in parallel. It is usually difficult or
even practically impossible to foresee all the requirements, conditions, criteria and relations of
elements in the as yet unknown ‘solution’ to the decision situation, which are often highly dependent
upon the kind of solutions we are trying to impose.

7.  REFLECTIONS  ON  THE  LOGIC  PROGRAMMING  APPROACH 

Advantages of the logic programming approach

High problem solving power including automatic or interactive ways of

• Finding applicable knowledge within the knowledge base

• Knowledge and problem specialization

• Problem decomposition

• Using and combining alternative definitions, relations and methods

• Applying problem solving strategies and collected experience

• Conflict resolution

Problem formulation matters

• Problems need not be specified at development time

• Requirements are not restricted to have a specific mathematical form

• Incomplete problems or incomplete information can be handled

• A concept level language can be developed for problem description

Generality, flexibility and extendability of the support system are ensured by

• Declarative programming for human beings and computer interpretation

• High modularity and flexible communication between sentences, the basic program units

• Self standing sentences accepting many input/output patterns

Rapid prototyping and incremental systems development supported by

• Automatic knowledge inclusion, search and specialization

• Alternative definitions of objects, their properties and relations

• Meta-programming facilities

Concept development support, including

• Natural language interpretation facilities

• Defining new concepts in terms of relations of existing concepts

• Representing different ways of thinking and reasoning

• Flexible representation of knowledge and meta-knowledge

• Finding contradictions and resolving conflicts

Difficulties with the logic programming approach

The traditional software development technology is often inadequate

• The analysis-planning-coding-testing cycle

• Open or hidden procedurality, including program control elements

• A concept description serving present purpose(s) only

• Precise program specifications may exclude hidden intentions, usual testing is insufficient

Keeping a high level of generality/flexibility with improving efficiency

• The ever improving efficiency should come mainly from more knowledge

• Logic level efficiency is also important, but should be kept clear and understandable

• Implementation level efficiency should be kept separated and hidden

• Limiting the potentially vast search space by meta-knowledge

• Preventing combinatorial explosion by appropriate techniques

Knowledge engineering is difficult

• Knowledge acquisition requires experience, courage and understanding

• Knowledge representation should be general and flexible

• Run-time addition of knowledge requires strong knowledge management

Distance from traditional support systems

• Existing models, problems, algorithms cannot be directly ‘plugged-in’

• Experts using more traditional approaches need time/understanding

• It is quite difficult to accept conceptual support from a computer

8.  COMBINING  THE  TWO  APPROACHES 

Extensions  of  optimization  problems  and  algorithms  by  logic  programming 

The mathematical programming approach might benefit from a logic programming extension and a
knowledge base in the following areas:

• Natural language interface between the expert and the support system

•  Man  -  machine  interactions,  ‘collective’  problem  solving 

•  Problem  transformations 

•  Pre-  and  postprocessing  of  data,  requirements,  criteria  and  solutions 

•  Reasoning  outside  the  scope  of  the  mathematical  model 

•  Supporting  formulation  of  problems  from  intentions 

•  Selecting  and  combining  existing  algorithms  for  well-defined  problems 

•  Finding  contradictions  and  resolving  conflicts 

•  Using  a  knowledge  base  to  improve  algorithm  performances 

•  Development  and  use  of  generic  algorithms 

•  Generation  and  experimentation  with  various  heuristics 

Extensions of logic programming by mathematical programming techniques

A logic programming based decision support system might benefit from optimization techniques
and models in several different ways:

• Traditional models and algorithms as internal utilities

•  Relaxations  and  estimates  for  pruning  the  search  tree 

•  Grouping  and  ordering  the  set  of  solutions  e.g.  by  lexicographic  methods 

•  Repetition  free  generation  of  all  solutions 

•  Sampling  techniques  in  case  of  very  many  solutions 

•  Selection  from  alternative  heuristics 

•  Reorganizing  the  search  space 

• Partial ordering techniques

9. CONCLUSION

The mathematical programming and the logic programming based approaches to modelling and
problem solving in support of decision processes are quite different. Serious efforts have been made
towards a deeper integration of mathematical programming into logic programming. One of them is the
definition and implementation of Prolog III, in which linear programming, with a flexible application
of the simplex method, is part of the language and is readily available as part of the declarative
programming. Other constraint satisfaction approaches are also being used for logic programming. In
constraint satisfaction the actual numerical problem solving is delayed if numeric data are not yet
available or the selection is not unique. In this way the advantages of declarative programming can
be preserved. The suggestions in section 8 indicate that there are many more reasons for
combining the two approaches; the work has only begun.


Modelling Tools for Reformulating Logical Forms into Zero-One Mixed
Integer Programs

Bjarni Kristjansson*, Cormac Lucas**, Gautam Mitra** and Shirley Moody**

* Maximal Software Ltd., Klapparas 11, IS-110 Reykjavik, Iceland

** Department of Mathematics and Statistics, Brunel University, Uxbridge,
Middx. UB8 3PH, UK

Introduction 

Computer based languages for constructing and analysing Mathematical
Programming models have been investigated over the last two decades.
There are many experimental and commercial systems currently available
which provide modelling support. Most modern modelling systems enable
the modeller to specify models in a declarative algebraic language. A set of
algebraic statements in a modelling language both specifies and documents a
model, whereas the generation of a constraint matrix takes place in the
background.

Although  some  modelling  systems  have  been  extended  to  incorporate 
non-linearities  and  to  help  with  a  greater  variety  of  discrete  optimisation 
problems,  very  little  attention  has  been  given  to  the  modelling  of  discrete 
programming  extensions  of  LP  problems.  Many  Mathematical  Programming 
problems  involve  logical  restrictions  which  may  be  expressed  relatively  easily 
using  propositional  calculus,  but  the  reformulation  of  such  statements  into 
Mixed  Integer  Programs  (MIPs)  is  conceptually  difficult.  This  reformulation 
may  be  carried  out  systematically,  but  as  yet  there  is  no  computer  support 
for this task within a Mathematical Programming modelling system.
We present a reformulation procedure for transforming
statements  in  propositional  logic  into  integer  or  mixed  integer  programs; 
this  procedure  makes  novel  use  of  the  Reverse  Polish  notation  and  the 
resulting  expression  tree.  We  define  a  new  syntax  involving  logical 
propositions  and  operators  whereby  the  structure  of  an  LP  modelling 
language  is  extended.  This  method  is  particularly  suitable  as  a  modelling 
technique  which  allows  one  to  automate  the  reformulation  process  to 
construct  equivalent  IP  or  MIP  models.  The  final  goal  is  to  integrate  this 
modelling  function  into  an  "intelligent"  mathematical  programming  modelling 
support  system. 

Logic  Forms  Represented  by  0-1  Variables 

The  main  task  of  reformulation  is  to  transform  a  compound  proposition  into 
a  system  of  linear  constraints  so  that  the  logical  equivalence  of  the 
transformed  expressions  is  maintained. 

In  order  to  explain  the  reformulation  process  and  the  underlying  principles 
more  clearly,  two  cases  are  distinguished  namely,  connecting  logical  variables 
and  logically  relating  linear  form  constraints. 

Let $P_j$ denote the jth logical variable, which takes the value TRUE or
FALSE and represents an atomic proposition describing an action, option or
decision. Associate an integer variable with each type of action (or option).
This variable, known as the binary decision variable, is denoted by $\delta_j$ and
can take only the values 1 and 0 (binary). The connection of these variables
to the propositions is defined by the following relations:

$\delta_j = 1$ iff proposition $P_j$ is TRUE
$\delta_j = 0$ iff proposition $P_j$ is FALSE

Imposition of logical conditions linking the different actions in a model is
achieved by expressing these conditions in the form of linear constraints
connecting the associated decision variables.

Using Propositional Calculus, a list of standard form "variable
transformations" is defined. These transformations are applied to
compound propositions involving one or more atomic propositions $P_j$,
whereby the compound propositions are restated in linear algebraic forms
involving decision variables.
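
As an illustration of what such variable transformations look like, the sketch below pairs a few common connectives with candidate linear constraints on 0-1 variables and checks each pair against its truth table. The particular constraints are textbook forms chosen for illustration; they are an assumption and not necessarily the list used by the authors. The names `d1`, `d2` stand for the binary decision variables of two propositions P1, P2.

```python
from itertools import product

# Each entry pairs a propositional form with a candidate linear constraint
# on the 0-1 variables d1, d2 (d_j = 1 iff P_j is TRUE).
# These pairs are illustrative assumptions, not the authors' own table.
TRANSFORMATIONS = {
    "P1 or P2":      (lambda p, q: p or q,       lambda d1, d2: d1 + d2 >= 1),
    "P1 and P2":     (lambda p, q: p and q,      lambda d1, d2: d1 + d2 >= 2),
    "P1 implies P2": (lambda p, q: (not p) or q, lambda d1, d2: d1 - d2 <= 0),
    "P1 iff P2":     (lambda p, q: p == q,       lambda d1, d2: d1 - d2 == 0),
    "P1 xor P2":     (lambda p, q: p != q,       lambda d1, d2: d1 + d2 == 1),
    "not P1":        (lambda p, q: not p,        lambda d1, d2: 1 - d1 >= 1),
}

def verify(logic, constraint):
    """True if the constraint holds exactly when the proposition is TRUE."""
    return all(
        bool(logic(p, q)) == bool(constraint(int(p), int(q)))
        for p, q in product([False, True], repeat=2)
    )

results = {name: verify(*pair) for name, pair in TRANSFORMATIONS.items()}
for name, ok in results.items():
    print(f"{name:15s} equivalent: {ok}")
```

Checking by enumeration is only practical for a handful of variables, but it is a convenient way to validate a transformation table before it is used in a reformulation.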

Bound Analysis/Logically Relating Linear Form Constraints

Consider the linear form restriction

$$LF_k:\ \sum_{j=1}^{n} a_{kj} x_j \;\rho\; b_k$$

where $\rho$ defines the type of mathematical relation, $\rho \in \{\le, \ge, =\}$. Let $L_k$,
$U_k$ denote the lower and upper bounds, respectively, on the corresponding
linear form, that is

$$L_k \le \sum_{j=1}^{n} a_{kj} x_j - b_k \le U_k .$$

Finite bounds $L_k$ and $U_k$ are used in the reformulation procedure. These
bounds may be given or, alternatively, can be computed for finite ranges of
$x_j$. For example, if $l_j \le x_j \le u_j$ $(j = 1,\dots,n)$ then

$$L_k = \sum_{j \in P_k} a_{kj} l_j + \sum_{j \in N_k} a_{kj} u_j - b_k \quad\text{and}\quad U_k = \sum_{j \in P_k} a_{kj} u_j + \sum_{j \in N_k} a_{kj} l_j - b_k$$

where $P_k = \{j : a_{kj} > 0\}$ and $N_k = \{j : a_{kj} < 0\}$.
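
The bound computation can be sketched in a few lines. The function name `linear_form_bounds` and the numerical data are hypothetical; the logic follows the rule above: positive coefficients take the lower variable bound for the lower bound of the form and the upper variable bound for its upper bound, and negative coefficients the reverse.

```python
def linear_form_bounds(a, lower, upper, b):
    """Return (L, U) bounding sum_j a[j]*x[j] - b over lower[j] <= x[j] <= upper[j]."""
    L = -b
    U = -b
    for a_j, l_j, u_j in zip(a, lower, upper):
        if a_j > 0:          # j in P_k
            L += a_j * l_j
            U += a_j * u_j
        elif a_j < 0:        # j in N_k
            L += a_j * u_j
            U += a_j * l_j
    return L, U

# Hypothetical data: 2*x1 - 3*x2 + x3 compared with b = 6,
# with 0 <= x1 <= 4, 1 <= x2 <= 5, 0 <= x3 <= 10.
L_k, U_k = linear_form_bounds(a=[2, -3, 1], lower=[0, 1, 0], upper=[4, 5, 10], b=6)
print(L_k, U_k)
```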


A "Logical Constraint in the Implication Form" (LCIF) is a logical
combination of simple constraints and is defined as

If antecedent then consequent

where the antecedent is a logical variable and the consequent is a linear form
constraint.

A "logical constraint in the general form" can always be reduced to an
LCIF using standard transformations. To model the LCIF, a 0-1 indicator
variable is linked to the antecedent. Whether the linear form constraint $LF_k$
applies or otherwise is indicated by a 0-1 variable $\delta_k$:

$\delta_k = 1$ iff the kth linear restriction applies

$\delta_k = 0$ iff the kth linear restriction does not apply

A set of constraint transformations is defined which illustrates how
this binary variable, namely the indicator variable of the antecedent, relates,
through the bound value, to the linear form restriction, that is, the consequent.
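
A minimal sketch of one such constraint transformation, assuming the consequent is a "less than or equal" restriction: the implication "indicator = 1 implies sum_j a_j x_j <= b" can be written as the single linear constraint sum_j a_j x_j - b <= U (1 - indicator), where U is the finite upper bound of the linear form. This is the standard bound-based device and is offered as an assumption about the kind of transformation meant; the data and function name are hypothetical.

```python
from itertools import product

def implication_constraint_holds(delta, x, a, b, U):
    """Linear constraint modelling 'delta = 1  =>  sum_j a[j]*x[j] <= b'."""
    lhs = sum(a_j * x_j for a_j, x_j in zip(a, x)) - b
    return lhs <= U * (1 - delta)

a, b = [2, -3, 1], 6
# Integer grid inside the variable bounds 0<=x1<=4, 1<=x2<=5, 0<=x3<=10.
ranges = [range(0, 5), range(1, 6), range(0, 11)]
U = 9  # upper bound of a.x - b over the box (2*4 - 3*1 + 10 - 6)

# With delta = 0 the constraint is redundant for every point in the box.
relaxed_when_off = all(
    implication_constraint_holds(0, x, a, b, U) for x in product(*ranges)
)
# With delta = 1 it coincides with the original restriction a.x <= b.
enforced_when_on = all(
    implication_constraint_holds(1, x, a, b, U)
    == (sum(a_j * x_j for a_j, x_j in zip(a, x)) <= b)
    for x in product(*ranges)
)
print(relaxed_when_off, enforced_when_on)
```

The tighter the bound U, the tighter the linear relaxation of the resulting mixed integer program, which is why finite bounds matter in the reformulation.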

Polish Notation and Expression Trees

Using the normal precedence of operators and the conventional evaluation of
expressions, the following logical form

$P \lor Q \lor \lnot R \land S$

would be written as

$((P \lor Q) \lor ((\lnot R) \land S))$.

Not using brackets as above but simply placing the operator symbols
at the nodes, one can build up a tree representation using Polish notation.
Choice of the direction in which the variables and symbols are scanned
leads to two well known variations, namely, forward (right to left scan) or
reverse (left to right scan) Polish notation. The Polish notation for an
expression is not unique and within forward Polish, for instance,
early-operator form or late-operator form leads to two different notations and
corresponds to inserting Church's brackets from the left or from the right
respectively. The given expression can be written as

$((P \lor Q) \lor ((\lnot R) \land S))$

or

$(P \lor (Q \lor (\lnot R \land S)))$.

The tree representation for the first of these expressions is shown below.

[Figure: expression tree for $((P \lor Q) \lor (\lnot R \land S))$]
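
To make the link between the bracketed form, reverse Polish notation and the expression tree concrete, here is a small sketch. It uses the well-known shunting-yard conversion, which is a substitute technique chosen for illustration and not necessarily the authors' procedure. The ASCII tokens `|`, `&` and `~` stand for "or", "and" and "not"; these tokens and the function names are assumptions.

```python
PRECEDENCE = {"~": 3, "&": 2, "|": 1}

def to_reverse_polish(tokens):
    """Shunting-yard conversion of an infix token list to reverse Polish order."""
    output, stack = [], []
    for tok in tokens:
        if tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()                      # discard the "("
        elif tok in PRECEDENCE:
            # "~" is a prefix operator: it never pops operators already stacked.
            while (tok != "~" and stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        else:
            output.append(tok)               # atomic proposition
    while stack:
        output.append(stack.pop())
    return output

def build_tree(rpn):
    """Build a nested-tuple expression tree from reverse Polish tokens."""
    stack = []
    for tok in rpn:
        if tok == "~":
            stack.append((tok, stack.pop()))
        elif tok in PRECEDENCE:
            right = stack.pop()
            left = stack.pop()
            stack.append((tok, left, right))
        else:
            stack.append(tok)
    return stack.pop()

rpn = to_reverse_polish("P | Q | ~ R & S".split())
tree = build_tree(rpn)
print(rpn)
print(tree)
```

With left-associative binary operators the resulting tree corresponds to the first bracketing given above.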


The  Algorithm 

The  reformulation  of  a  logical  statement  into  inequalities  is  not  unique;  in 
fact  as  a  result  of  the  many,  but  equivalent,  forms  any  logical  statement  can 
take,  there  are  often  different  ways  of  generating  the  same  or  equivalent 
mathematical  reformulations. 

One possible way would be to convert the desired expression into a
normal form, such as conjunctive normal form (a conjunction of disjunctive
clauses). Each clause is then transformed into a linear
constraint so that the resulting conjunctive normal form can be represented
by a system of constraints, all of which have to be satisfied, thereby invoking the logical
"and" operation.

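The normal-form route can be sketched as follows. Each disjunctive clause becomes one inequality: positive literals contribute their 0-1 variable, negated literals contribute one minus the variable, and the sum must be at least 1. The example expression is the one used earlier, P or Q or (not R and S), whose conjunctive normal form is (P or Q or not R) and (P or Q or S). The data layout and function names are hypothetical.

```python
from itertools import product

def clause_satisfied(clause, delta):
    """Linear test for one disjunctive clause: sum of literal values >= 1.

    A positive literal contributes delta[name], a negated one 1 - delta[name].
    """
    total = sum(
        delta[name] if positive else 1 - delta[name]
        for name, positive in clause
    )
    return total >= 1

# Conjunctive normal form of  P or Q or (not R and S).
CNF = [
    [("P", True), ("Q", True), ("R", False)],   # P or Q or not R
    [("P", True), ("Q", True), ("S", True)],    # P or Q or S
]

def original(p, q, r, s):
    return p or q or ((not r) and s)

# The system of clause inequalities is satisfied exactly when the
# original compound proposition is TRUE.
equivalent = all(
    all(clause_satisfied(c, dict(zip("PQRS", bits))) for c in CNF)
    == bool(original(*bits))
    for bits in product([0, 1], repeat=4)
)
print(equivalent)
```
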
In the absence of a systematic approach, the above process appears to
be unduly complicated. This has motivated us to propose a systematic
procedure to reformulate a logical condition imposed on a model into a set of
integer linear constraints. Our approach, in essence, involves identifying a
precise compound statement of the problem and then processing this
statement. This compound statement (S) is represented as an extended
expression tree by the Polish notation, and two working stack mechanisms,
namely VSTACK for variables and CSTACK for constraints, are created. The
expression tree is traversed, that is, the expression is analysed and
constraints are created in CSTACK using variables which are introduced in
VSTACK. The steps of the procedure which fully processes and resolves the
tree are described in our presentation.

Outline  of  a  Prototype  System 

The  reformulation  procedure  thus  described  is  illustrated  by  means  of  an 
example.  Consider  the  following  problem: 

In  order  to  satisfy  a  country’s  energy  demands,  it  is  possible  to  import 
coal,  gas  and  nuclear  fuel  from  three  neighbouring  countries.  There  are  three 
grades  of  coal  and  gas  (low,  medium  and  high)  and  one  grade  of  nuclear  fuel 
which  may  be  imported. 

The  import  costs  for  each  fuel  (in  £s  per  gigajoule  of  energy  obtained) 
are  provided  together  with  upper  and  lower  limits  on  the  fuel  supplied  by 
each  country.  The  problem  is  to  decide  what  quantities  of  each  fuel  should 
be  imported  from  each  country  so  that  the  total  import  cost  is  minimized  and 
the  country's  energy  requirements  are  met. 

In  addition,  there  are  the  following  logical  conditions  which  must  also 
be  satisfied: 

(i) Each country can supply either up to three non-nuclear, low or medium
grade fuels or nuclear fuel and one high grade fuel.

(ii) Environmental regulations require that nuclear fuel can be used only if
medium and low grades of gas and coal are excluded.

(iii) If gas is imported then either the amount of gas energy imported must
lie between 40 - 50% and the amount of coal energy must be between 20 - 30%
of the total energy imported or the quantity of gas energy must lie
between 50 - 60% and coal is not imported.

The  model  specification  for  this  problem  including  the  reformulation  of  the 
logical  conditions  is  detailed  in  our  presentation.  An  extended  syntax  is  also 
described  which  illustrates  how  reformulation  support  for  logical  statements 
may  be  incorporated  into  the  modelling  system  MPL  (Maximal  Software). 

Concluding  Remarks 

The zero-one mixed integer programming formulation of logical conditions
presented  as  propositional  calculus  statements  is  an  important  research  topic 
in  discrete  modelling.  In  our  work  the  established  syntax  of  representing  LP 
models  in  an  algebraic  form  is  extended  to  incorporate  logical  restrictions  set 
out  as  propositional  calculus  statements.  The  methods  described  do  not 
necessarily  achieve  the  most  computationally  efficient  model  after 
reformulation.  Our  main  aim  has  been  to  reduce  the  chore  for  an  experienced 
analyst,  and  also  to  provide  support  for  a  problem  owner  who  is  capable  of 
describing  his  problem  but  may  not  be  experienced  in  reformulation 
techniques.  A  system  constructed  in  this  way  not  only  provides  discrete 
modelling  support  but  can  also  be  used  as  a  teaching  aid  to  new  modellers  in 
MIP reformulation techniques.

ABSTRACT


Madrid and Ramsey Models for Slovak Economy.

Adam Laščiak, University of Economics,
Bratislava, Odbojárov 10.


Madrid's model (3), which is based on Pontryagin's
approach applied to a Keynesian type of economy, was adapted and
solved for the Slovak economy with the help of NLMOS software.

Ramsey's model (4) is the mathematical and economic
application for the optimal repartition of investment and consumption
as a framework of national income growth in Slovakia. GAMS
software presents a special possibility to solve this variational
problem with MINOS adaptation.

We have used the following adaptation of Madrid's model
(proposed by McFadden in 1969, see (5)) for the Slovak economy.

General accounting identities.

Y(k) = C(k) + S(k) + T(k)
X(k) = C(k) + I(k) + K(k) + G(k)
B(k) = E - M(k) - K(k)
R(k) = Y(k) - X(k)
T(k) = Y(k) - C(k) - S(k)

Adaptation for Slovak economy.

GNPB = -26738P.3 + 0.876 GNPA + T + C
X = -5738.8 + 0.497 HND + G + C + 12408.9 RIN
B = 45739.5 - 0.286 GNPB + E - 1171.8 REM - 12408.9 RIN
R = GNPB - X

Behavioral equations.


$S(k) = -a_0 + a_1 Y(k)$, $a_1 > 0$ : Aggregate saving
$M(k) = -b_0 + b_1 Y(k)$, $b_1 > 0$ : Imports of goods and services
$I(k) = d_0 - d_1 r(k)$, $d_1 > 0$ : Domestic gross investment
$K(k) = c_0 - c_1 r(k)$, $c_1 > 0$ : Net capital outflows
$C(k) = m_0 + m_1 Y(k)$, $m_1 > 0$ : Aggregate private consumption
$E$ : Exports (exogenous)

Control variables.

$u(k) = (dr(k),\ dD(k))^T$, where

$dr(k) = r(k+1) - r(k)$
$D(k) = G(k) - T(k)$

Symbols and parameters.

r(k)    Domestic nominal interest rate
rf      Foreign interest rate
Y(k)    Domestic production (GNPB)
X(k)    Aggregate expenditure (X)
G(k)    Aggregate public consumption
T(k)    Taxes
GNPA    Domestic production and saving, Y(k) + S(k)
HND     Gross national income
REM     Exchange rate for USD
RIN     Nominal domestic interest rate
B(k)    Net surplus in balance of payments (B)
R(k)    Balance between resources and expenditure (R)

The quarterly statistical indicators were taken from the
period 1989Q1 - 1991Q4 and the solution for the model variables
is given for the period 1992Q1 - 1993Q4.

The model was controlled by taxes, public consumption and the
nominal interest rate. The goal of the model was maximization
of the GNPB.

The solution of the model indicates that the tax volume for
Slovakia was at the limits of what the Slovak economy can bear, and that the
interest rate was not significant as a control parameter. The
relation for the net surplus in the balance of payments can be satisfied
only as an approximate goal.

All statistical indicators, derived parameters and given
relations of the model were tested with the SORITEC software.
The model was also calculated for the given period 1989Q1 -
1991Q4. In this case it is possible to use the results of Optimal Control
Theory to calculate the control that minimizes the
deviations from the desired control trajectory and to make an
analysis of the past period.

The basic core of the model may be described in the
following form:

$$\begin{pmatrix} x(k+1) \\ y(k+1) \end{pmatrix} = \begin{pmatrix} I & 0 \\ I & 0 \end{pmatrix} \begin{pmatrix} x(k) \\ y(k) \end{pmatrix} + \begin{pmatrix} B \\ 0 \end{pmatrix} u(k)$$

$$y(k) = \begin{pmatrix} 0 & I \end{pmatrix} \begin{pmatrix} x(k) \\ y(k) \end{pmatrix}$$

where

$$u(k) = \begin{pmatrix} dr(k) \\ dD(k) \end{pmatrix}$$

is the control vector, in which $dr(k) = r(k+1) - r(k)$ is a monetary
measure and $dD(k) = D(k+1) - D(k)$, $D(k)$ being the domestic deficit
$D(k) = G(k) - T(k)$, is a fiscal measure;

$$x(k) = \begin{pmatrix} B(k) \\ Y(k) \end{pmatrix}$$

is the state vector; and

$$B = \begin{pmatrix} c_1 + n b_1 d_1 & -n b_1 \\ -n d_1 & n \end{pmatrix}$$

is the input distribution matrix, with $n = 1/(a_1 + b_1)$.

If we are interested in the open loop control of the system (see details in (3))

$$\begin{pmatrix} B(k+1) \\ Y(k+1) \end{pmatrix} = I \begin{pmatrix} B(k) \\ Y(k) \end{pmatrix} + B \begin{pmatrix} dr_f(k) \\ dD(k) \end{pmatrix}$$

and

$$x_0(k) = \begin{pmatrix} B(0)(1 + e_B)^k \\ Y(0)(1 + e_Y)^k \end{pmatrix}$$

is the desired trajectory, then the open loop control should have the form

$$u_0(k) = B^{-1} \begin{pmatrix} B(0)\, e_B (1 + e_B)^k \\ Y(0)\, e_Y (1 + e_Y)^k \end{pmatrix}$$

which transforms $x_0(k)$ into a dynamic equilibrium path. For
the Slovak economy, however, this is possible only with some
approximation and deviation from the goals. A dynamic
equilibrium path for the Slovak economy will be very hard
to attain in the near future.
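
The open loop control can be checked numerically. The sketch below assumes the state equation reads x(k+1) = x(k) + B u(k) with a 2x2 input distribution matrix, and that the desired trajectory grows geometrically in each component; under these assumptions u0(k) is B inverse applied to the required increment of the trajectory. All numbers are hypothetical and are not estimates from the paper; `B_in` is an illustrative name for the input distribution matrix.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

# Hypothetical numbers for illustration only.
B_in = [[0.8, -0.3], [-0.5, 1.2]]   # input distribution matrix
x0 = [100.0, 400.0]                 # initial state (B(0), Y(0))
growth = [0.01, 0.02]               # desired growth rates (e_B, e_Y)

def desired(k):
    """Desired trajectory x0(k), component i growing at rate growth[i]."""
    return [x0[i] * (1 + growth[i]) ** k for i in range(2)]

def open_loop_control(k):
    """u0(k) = B^{-1} [B(0) e_B (1+e_B)^k, Y(0) e_Y (1+e_Y)^k]."""
    step = [x0[i] * growth[i] * (1 + growth[i]) ** k for i in range(2)]
    return mat_vec(inverse_2x2(B_in), step)

# Simulate the controlled system and record the worst deviation
# from the desired trajectory.
x = list(x0)
max_error = 0.0
for k in range(8):
    bu = mat_vec(B_in, open_loop_control(k))
    x = [x[0] + bu[0], x[1] + bu[1]]
    target = desired(k + 1)
    max_error = max(max_error, abs(x[0] - target[0]), abs(x[1] - target[1]))
print(max_error)
```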


The modification of the Ramsey model ((4),(6)) proposed for the Slovak
economy has the following form:

$$\max\ U = \sum_{t=t_1}^{t_T} b_t \log C_t$$

$A(t) K_t^{e} = C_t + I_t$ - National receipts
$I_t \le A (1 + ac)^{t}$ - Gross investment
$A = Y_0 / (K_0)^{e}$ - is the initial condition
$A(t) = A (1 + w)^{(1-e)t}$
$Y_t = C_t + I_t$ - "National income"
$K_{t+1} = K_t + I_t$ - Capital for the producing sphere
$w K_T \le I_T$ - "w" is the labour growth rate

and the initial conditions

$C_t \ge C_0$, $I_t \ge I_0$, $K_t \ge K_0$.

The other symbols:

$U$ - Utility function
$C_t$ - Aggregate consumption (for all inhabitants)
$b_t$ - Discount factor (necessary to make the parametrization in the steps of the optimization)
$e$ - Elasticity
$w$ - Labour growth rate
$ac$ - The measure of labour capacity absorption: 1/(K/L), L - labour capacity.

$Y(t) = F(\exp(rt)\, K^{e} L^{1-e})$ is the production function
as a checking function for National receipts (inhabitants and
enterprises) when the problem of taxes is outside Ramsey's model.
(For the taxes problem see the Madrid model.)

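To make the accounting relations of the Ramsey model concrete, the sketch below evaluates the objective for a given investment path rather than optimizing it. It assumes the production relation A(t) K_t^e = C_t + I_t, the capital update K_{t+1} = K_t + I_t, and a geometric discount factor b_t = discount^t; the last of these, the function name `simulate` and all numbers are hypothetical.

```python
import math

def simulate(investment, K0, Y0, e, w, discount):
    """Evaluate a given investment path under the model's accounting relations.

    Returns (utility, capital path). b_t = discount**t is an assumption.
    """
    A = Y0 / K0 ** e                          # initial condition A = Y0 / K0^e
    K, utility, capital = K0, 0.0, [K0]
    for t, I_t in enumerate(investment):
        A_t = A * (1 + w) ** ((1 - e) * t)    # A(t) = A (1+w)^((1-e)t)
        Y_t = A_t * K ** e                    # output available in period t
        C_t = Y_t - I_t                       # from A(t) K_t^e = C_t + I_t
        utility += discount ** t * math.log(C_t)
        K = K + I_t                           # K_{t+1} = K_t + I_t
        capital.append(K)
    return utility, capital

# Hypothetical data, chosen only to exercise the relations.
U, capital = simulate(investment=[50.0, 52.0, 54.0], K0=1000.0, Y0=160.0,
                      e=0.25, w=0.01, discount=0.95)
print(U, capital)
```

An optimizer such as the one mentioned in the text would search over the investment path subject to the bounds; this sketch only shows how one candidate path is scored.
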
All parameters and indicators were derived with the SORITEC
software. The adaptation of GAMS software with the MINOS algorithm
was successful.

Optimal solution of repartition of "National income" for
Slovak economy was the following:

(Mld Kčs)   NI      Consumption   Investment   Capital stock
1990        163.?   107.?         56.6         1163.6
1995        169.8   112.9         56.8         1446.6
2000        175.?   119.2         56.8         1741.2

(1)
1990        185.9   123.1         62.8         1163.6
1995        193.3   130.5         62.8         1477.9
2000        200.5   137.7         62.6         1804.9

(1) The first table follows the statistical classification of the
old calculation of Material Product; the second is in
accordance with Gross Domestic Product.

The Madrid and Ramsey models are only one part of the
system of models which are described as a successive approach to the market
mechanism in the Slovak economy. The key model for the period of
transition is MODELER, a model based on linear and
nonlinear econometric equations (a set of 75 equations), which is
outside the scope of this paper.

The possibility of economic growth in the period
1989Q1 - 1991Q4 is also described by means of extensive and intensive
factors, which have been developed into a model through dynamization
of different production functions on Hamilton's principle
of the least change of the economic state. For the mentioned
period we notice a decrease of the total economic growth dynamics
as well as of the share of influence of the intensive factors.

These mostly long-term tendencies also follow
from the results of using other analytical tools such as the IS - LM
model. But there were some problems with the application of the IS - LM model,
because at present we are in a period of transition from
a central to a market economy. Moreover, it is now very difficult to
obtain and to make precise the indicators of the money market. For
these reasons it is very difficult to achieve an equilibrium
between the goods and services market and the money market.

For the Keynesian or classical monetary approach in our
modelling of the way to the market mechanism there are the following reasons.
The difference lies in the assumption of classical macroeconomics
that prices and wages are flexible, trusting that after an
economic shock price flexibility can restore full employment
very fast. That is not the case for eastern countries, nor is it
for Slovakia. In fact, the Keynesian revolution combined two
elements. First, there is the concept of aggregate demand, in which
aggregate spending would be driven by the consumption function
and by investment decisions. Second, price flexibility in
the transition period is only one-way, and wages are,
by consensus, inflexible or sticky. (See Paul
Samuelson - William Nordhaus in (7).)


Abstract

The  use  of  interior  point  method  (IPM)  based  optimizer  as  a  robust  linear 
programming  (LP)  solver  is  now  well  established.  The  default  parameter 
settings  of  new  generation  of  IPM  solvers  are  sufficient  to  process  a  wide 
class  of  LP  problems  whereas  traditional  sparse  simplex  (SSX)  based  solvers 
may  require  a  considerable  adaptation  and  tuning  from  one  model  class  to 
another.  The  progress  of  IPM  iterations  is  not  hindered  by  the  degeneracy  or 
the  stalling  problem  of  SSX,  indeed  it  reaches  the  ’near  optimum’  solution 
very  quickly.  The  SSX  algorithm,  in  contrast,  is  not  affected  by  the  boundary 
conditions  which  slow  down  the  convergence  of  IPM. 

The extreme point solution is the cornerstone of the SSX algorithm, and the use
of the corresponding optimal LP basis as a starting point for solving integer
programming problems or for post optimal analysis, amongst others, is well known.
The IPM algorithms usually converge to a point in the interior of the optimal
face (a non-extreme point), and their poor behaviour near the boundary makes it
difficult to apply IPM in the same way as the SSX.

To  take  advantage  of  the  attractive  properties  of  IPM  and  SSX,  we  have 
designed  a  hybrid  framework  whereby  cross  over  from  IPM  to  SSX  can  take 
place  at  any  stage  of  the  IPM  optimisation  run.  The  cross  over  to  SSX,  at  a 
non  optimal  solution,  involves  the  prediction  of  the  optimal  face.  Our 
prediction  incorporates  several  methods  suggested  by  us  and  other  researchers 
in  this  area. 

We  review  a  number  of  cross  over  methods  and  test  their  suitability  for 
optimal  and  intermediate  cross  over.  Some  computational  results  on  a  set  of 
degenerate and non degenerate test problems are given.

1. Introduction

The use of IPM for the solution of linear programs provides a number of benefits which are
summarized below. For large or highly degenerate LPs IPM is usually faster than the SSX
solver. Whereas SSX based algorithms require considerable adaptation and control parameter
tuning from one model class to another, default settings of IPM are sufficient to process a
wide class of LPs. IPM is not only robust in this way; its progress is also not hindered by the
degeneracy or the stalling problem of the SSX, and indeed it reaches the "near optimal" solution
very quickly. SSX algorithms, in contrast, are not affected by the boundary conditions which
slow down the convergence of IPM.

There  are  three  well  known  and  well  exercised  extensions  of  traditional  LP  namely, 
successive  linear  programming,  integer  programming,  and  post-optimal  analysis.  In  all  these 
cases  optimum  solutions  for  a  family  of  problems  need  to  be  computed  which  in  turn 
involves  reoptimization  from  the  last  computed  primal  and  dual  optimum  basis  (extreme  point 
solution). The extreme point solution is the cornerstone of SSX algorithms and the use of the
corresponding basis as a starting point is naturally applied in this context to solve efficiently
a family of similar problems. IPM algorithms, on the other hand, usually converge to a
point in the interior of the optimal face. This property and the behaviour of the IPM
algorithm near the boundary (Megiddo 89) make it difficult to apply IPM in the same way
for a family of similar LPs. Some researchers have attempted to develop new theory and
methods that do not depend on the extreme point representation (Adler 89) (Guler 92).
Alternatively,  the  fast  initial  convergence  of  IPM  to  a  near  optimal  solution  can  be  followed 
up  by  the  superior  near  optimal  to  optimal  convergence  of  SSX  algorithms.  This  strategy 
is  not  only  computationally  attractive  it  also  provides  the  (currently)  desirable  extreme  point 
solution  and  the  corresponding  primal  and  dual  optimum  basis.  Most  researchers  (  see  for 
example  (Megiddo  88,91),  (Bixby  92),  (Mitra  88)  )  consider  this  latter  approach  to  be  a 
promising  computational  scenario.  This  hybrid  approach,  however,  requires  a  substantial 
performance  superiority  of  IPM  and  an  efficient  IPM-SSX  integration  to  make  it  worthwhile. 

The  research  issues  reported  in  this  paper  are  mainly  concerned  with  extending  IPM  whereby 
it  can  be  brought  into  the  mainstream  of  solving  large  LP’s.  After  this  introduction,  in 
section  2  we  review  and  extend  methods  for  predicting  the  variables  that  are  active  in  the 
IPM solution before optimality is reached. In section 3 we describe methods for converting
the IPM optimal or predicted solution to an extreme point solution. In Section 4 we present
experiments in early and optimal cross over from IPM to SSX using alternative prediction
criteria. Our conclusions are summarized in section 5.

2. Prediction of variables in an optimal solution

Consider the following primal and dual LP problems:

$$\text{(Primal)}\quad \min\ c^T x \quad \text{s.t.}\quad Ax = b,\ x \ge 0$$

$$\text{(Dual)}\quad \max\ b^T y \quad \text{s.t.}\quad A^T y + z = c,\ z \ge 0 \qquad (2.1)$$

$$A \in R^{m \times n},\quad x, z, c \in R^n,\quad y, b \in R^m$$

An interior point algorithm applied to these problems generates a sequence of strictly interior
points and theoretically converges to an optimal solution which is a boundary point of the
feasible polyhedron. The actual termination of the algorithm, however, is not on the
boundary of the polyhedron but in the interior (Levkovitz 92) close to an optimal solution.
At the optimal solution, strict complementarity is enforced, that is:

$$c^T x - b^T y = x^T z = 0 \quad\text{and}\quad XZe = 0 \qquad (2.2)$$

We define the indicator sets of active (positive) and dormant (non positive) indices $\sigma(v)$, $\bar\sigma(v)$
of a non negative vector $v$ as:

$$v \in R^n,\ v \ge 0,\quad \sigma(v) = \{i : v_i > 0\},\quad \bar\sigma(v) = \{1,\dots,n\} - \sigma(v) \qquad (2.3)$$

Of all the primal optimal solutions and the dual optimal solutions of (2.1) there exists at least
one solution pair $(x^*)$, $(y^*, z^*)$ where the strict complementarity of (2.2) applies, or in terms
of the indicator set $\sigma$ defined in (2.3):

$$\text{(i)}\ \ \sigma(x^*) \cap \sigma(z^*) = \emptyset \quad\text{and}\quad \text{(ii)}\ \ \sigma(x^*) \cup \sigma(z^*) = \{1,\dots,n\} \qquad (2.4)$$

We call this solution the strict complementary solution. We note that the second condition
of (2.4) does not hold if both the primal and the dual problems are degenerate. This case
is considered in section 3.
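
The indicator sets and the two conditions of (2.4) are easy to express in code. The sketch below uses 1-based indices to match the text; the tolerance parameter `tol`, the function names and the sample vectors are hypothetical additions (in practice an interior point iterate has no exact zeros, so some threshold is needed).

```python
def active(v, tol=0.0):
    """sigma(v): indices (1-based) of components strictly greater than tol."""
    return {i for i, v_i in enumerate(v, start=1) if v_i > tol}

def is_strictly_complementary(x, z, tol=0.0):
    """Check conditions (i) and (ii) of (2.4) for a primal x and dual slack z."""
    n = len(x)
    sx, sz = active(x, tol), active(z, tol)
    disjoint = not (sx & sz)                       # (i)  empty intersection
    covering = (sx | sz) == set(range(1, n + 1))   # (ii) union is {1,..,n}
    return disjoint and covering

# Hypothetical vectors for illustration.
x_star = [3.0, 0.0, 1.5, 0.0]
z_star = [0.0, 2.0, 0.0, 0.5]
z_degenerate = [0.0, 2.0, 0.0, 0.0]   # index 4 is zero in both x and z

strict = is_strictly_complementary(x_star, z_star)
not_strict = is_strictly_complementary(x_star, z_degenerate)
print(active(x_star), strict, not_strict)
```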

Guler  and  Ye  (91)  show  that  a  class  of  interior  point  methods  which  also  contains  the  primal 
dual  predictor  corrector  algorithm  (Mehrotra  90)  generates  a  sequence  of  feasible  pairs 
the IPM optimal or predicted solution to an extreme point solution. In section 4 we present
experiments in early and optimal cross over from IPM to SSX using alternative prediction
criteria. Our conclusions are summarized in section 5.

2.  Prediction  of  variables  in  an  optimal  solution 

Consider  the  following  primal  and  dual  LP  problems: 

(Primal)  $\min\; c^T x$  s.t.  $Ax = b,\; x \ge 0$

(Dual)  $\max\; b^T y$  s.t.  $A^T y + z = c,\; z \ge 0$  (2.1)

$A \in \mathbb{R}^{m \times n}$,  $x, z, c \in \mathbb{R}^n$,  $y, b \in \mathbb{R}^m$

An  interior  point  algorithm  applied  to  these  problems  generates  a  sequence  of  strictly  interior 
points  and  theoretically  converges  to  an  optimal  solution  which  is  a  boundary  point  of  the 
feasible  polyhedron.  The  actual  termination  of  the  algorithm,  however,  is  not  on  the 
boundary  of  the  polyhedron  but  in  the  interior  (Levkovitz  92)  close  to  an  optimal  solution. 
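At the optimal solution, strict complementarity is enforced, that is:
$c^T x - b^T y = x^T z = 0$ and $XZe = 0$  (2.2)
where $X$ and $Z$ are the diagonal matrices formed from $x$ and $z$, and $e$ is the vector of ones.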

We define the indicator sets of active (positive) and dormant (non positive) indices $\sigma(v)$, $\bar{\sigma}(v)$
of a non negative vector $v$ as:

$v \in \mathbb{R}^n$,  $v \ge 0$,  $\sigma(v) = \{ i : v_i > 0 \}$,  $\bar{\sigma}(v) = \{1,..,n\} - \sigma(v)$  (2.3)

Of all the primal optimal solutions and the dual optimal solutions of (2.1) there exists at least
one solution pair $(x^*)$, $(y^*, z^*)$ where the strict complementarity of (2.2) applies, or in terms
of the indicator set $\sigma$ defined in (2.3):

(i) $\sigma(x^*) \cap \sigma(z^*) = \emptyset$  and  (ii) $\sigma(x^*) \cup \sigma(z^*) = \{1,..,n\}$  (2.4)

We call this solution the strict complementary solution. We note that the second condition
of (2.4) does not hold if both the primal and the dual problems are degenerate. This case
is considered in section 3.
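
The indicator sets of (2.3) and the two conditions of (2.4) can be checked mechanically. The following is a minimal Python sketch; the function names and the optional tolerance `tol` are illustrative assumptions and do not come from the original text.

```python
def active_set(v, tol=0.0):
    """sigma(v): 1-based indices i with v_i > tol (tol is an assumed parameter)."""
    return {i for i, vi in enumerate(v, start=1) if vi > tol}


def dormant_set(v, tol=0.0):
    """Complement of sigma(v) in {1,..,n}."""
    return set(range(1, len(v) + 1)) - active_set(v, tol)


def is_strictly_complementary(x, z, tol=0.0):
    """Check conditions (i) and (ii) of (2.4) for a pair (x, z)."""
    sx, sz = active_set(x, tol), active_set(z, tol)
    all_indices = set(range(1, len(x) + 1))
    return sx.isdisjoint(sz) and (sx | sz) == all_indices


# Small made-up pair: every index is active in exactly one of x, z.
x_star = [2.0, 0.0, 1.5, 0.0]
z_star = [0.0, 3.0, 0.0, 0.5]
print(active_set(x_star), dormant_set(x_star))
print(is_strictly_complementary(x_star, z_star))
```

If some index is zero in both vectors, condition (ii) fails and the function returns False.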

Guler  and  Ye  (91)  show  that  a  class  of  interior  point  methods  which  also  contains  the  primal 
dual  predictor  corrector  algorithm  (Mehrotra  90)  generates  a  sequence  of  feasible  pairs 
$(x^k, z^k)$ such that:

[...]  (2.5)

For this class of algorithms, they also proved the following theorem:

Theorem 2.1: At iteration $k$, let [...] and assume the LP data is rational. If $L$ is
the input length of the problem then for all algorithms that satisfy (2.5), if [...]

Theorem 2.1 shows that when an interior point is close enough to the optimal solution, the
set $\sigma^*$ of active variable indices can be identified. The dormant variables can be set to zero
and a smaller problem can be solved to retrieve the 'exact' optimal solution on a boundary
of the polyhedron.

Although Theorem 2.1 gives a theoretical stopping criterion for the primal dual IPM, in
practice, a more realistic criterion is needed. Mehrotra (91-2) proves that whenever the
set $\sigma^*$ can be identified, the algorithm can terminate the normal execution and perform the
following computation to retrieve the primal and dual optimal solutions:

(Primal)  [...]

(Dual)  [...]  (2.6)

These equations can be solved in a single 'IPM-like' iteration.

There  are  some  obvious  advantages  in  terminating  IPM  before  reaching  the  close 
neighbourhood  of  the  optimal  solution  and  by  applying  a  procedure  similar  to  (2.6).  Some 
of  these  advantages  are  listed  below: 


(i) At an early stage of the IPM search procedure, namely 50% of the total number of
iterations, IPM finds a near optimum solution which is often within 80-90% of the final
solution.

(ii) The numerical stability of the algorithm deteriorates when the algorithm gets close to the
boundary. The computation of the trajectory in each IPM iteration includes the
computation of the diagonal matrix $D^k = X^k (Z^k)^{-1}$. The diagonal of this matrix can have some
very high value entries for variables that participate in the solution and some near zero
entries for variables that converge to their lower bound. In such a case, the matrix $A D^k A^T$
can become ill conditioned and cause numerical errors.

(iii) As active or dormant variables are identified, the size of the problem can be reduced by
removing them. In particular, the removal of dormant variables maintains primal feasibility
and reduces the computation work of every iteration. Also, the removal of primal variables
that converge to zero increases the numerical stability of the calculation.
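
Point (ii) can be illustrated with a toy calculation. The Python sketch below assumes, purely for illustration, a two-variable pair $(x, z)$ with $x_j z_j = \mu$ for both variables (one active, one dormant); the function names and the sample values of $\mu$ are hypothetical. It only shows how the spread of the diagonal of $X Z^{-1}$ grows as $\mu$ decreases; it does not compute the condition number of $A D A^T$ itself.

```python
def central_path_pair(mu, x_active=1.0, z_dormant=1.0):
    """Toy pair (x, z) with x_j * z_j = mu (an assumption for illustration):
    variable 0 is active (x stays positive), variable 1 is dormant (x -> 0)."""
    x = [x_active, mu / z_dormant]
    z = [mu / x_active, z_dormant]
    return x, z


def scaling_diagonal(x, z):
    """Diagonal entries of D = X Z^{-1}."""
    return [xj / zj for xj, zj in zip(x, z)]


def diagonal_spread(d):
    """Ratio of the largest to the smallest diagonal entry of D."""
    return max(d) / min(d)


spreads = {}
for mu in (1e-2, 1e-4, 1e-6):
    x, z = central_path_pair(mu)
    spreads[mu] = diagonal_spread(scaling_diagonal(x, z))
    print(f"mu = {mu:.0e}   spread of D = {spreads[mu]:.3e}")
```

In this toy setting the spread behaves like $1/\mu^2$, which is one way to see why the late iterations are the numerically delicate ones.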

As  a  result,  there  is  considerable  interest  in  identifying  the  active  and  dormant  variables  by 
heuristic  procedures  at  an  intermediate  stage  of  the  IPM  search.  Gay  (88),  Mehrotra  (91-2), 
Zhang  et  al.  (91),  El-Barki  et  al.  (91),  Levkovitz  (91)  and  others  have  put  forward  heuristics 
and  report  encouraging  results.  These  heuristics  are  based  on  indicator  functions  which  are 
calculated at every iteration. El-Barki et al., in their survey of indicator functions, classify
them into several types, namely variables that are used as indicators, primal dual indicators, and
Tapia's indicators, which are also related to the primal dual indicators. The key result in this
survey is the proof that the convergence rate of the indicator sets to their optimal value is
faster than the convergence rate of the solution. Thus, the solution set can be identified
before the actual solution is reached. From a practical point of view, the need is to find
some criteria that give a sharp separation between the sets of active and dormant variables.
It is clear that the sets defined in Theorem 2.1 do not give such a separation. El-Barki's
survey  and  our  investigations  indicate  that  using  the  variable  values  as  indicators  suffers  from 
the  same  disadvantage:  in  the  NETLIB  problem  Pilot,  for  example,  some  variables  that 
participate  in  the  solution  have  very  low  values;  even  at  a  relatively  late  stage  of  the 
optimization  it  is  difficult  to  distinguish  between  them  and  the  variables  that  are  going  to 
their  bounds.  In  primal  only  or  dual  only  algorithms,  however,  the  indicators  are  usually 
based  on  such  a  criterion  (Gay  88). 

In  primal  dual  IPM  algorithms,  both  primal  and  dual  solution  vectors  are  computed  at  each 
iteration.  This  provides  an  opportunity  to  compute  indicators  based  on  the  primal  and  dual 
behaviour  throughout  the  iterative  process.  Mehrotra  (91-2),  Tapia,  El-Barki  (91)  and 
others have proposed the following indicator function:

[...]  (2.7)

These indicator functions are applied to the primal and dual variables to create the indicator
set in the (k+1)-th iteration of IPM:

[...]  (2.8)

Mehrotra uses the set of (2.8) to define the active variables of (2.6).

The primal dual indicators used by Gay, Lustig and others (Gay 88) are stated as follows:

$x_j^k / z_j^k$  (2.9)

These  indicators  are  based  on  the  simple  observation  that  when  the  sequence  of  solution 
points  converges  to  the  optimal  solution,  the  indicator  for  the  active  variables  converges  to 
infinity  while  the  indicator  of  a  dormant  variable  converges  to  zero.  This  property  provides 
a  sharp  separation  between  the  two  sets  but  only  in  the  last  few  iterations  of  the  algorithm. 


For our investigations, we developed a different kind of primal dual indicators. These
indicators can be seen as a combination of those used by Tapia and those used by Lustig.
The indicator function is stated below:

[...]  (2.10)

where

[...] $x_j^{k+1}$, $z_j^{k+1}$ [...]  (2.11)

These indicator functions are used in a prediction heuristic incorporated into our primal dual
predictor corrector solver. The predicted sets of active and dormant variables are used in
our cross-over procedure whereby the predicted solution variable set is used to fix the initial
basis.

3.  Cross  over  to  SSX  and  basis  recovery 

In non degenerate LP problems, the optimal solution point set is restricted to a single
extreme point of both primal and dual feasible polyhedra. In most practical cases,
however, the LP problem is primal or dual degenerate (or both) and the optimal solution
point set describes a face of the feasible polyhedra. It is well known that the optimal
solution generated by the primal dual IPM algorithm converges to an interior point of this
set (Megiddo 89). In many cases, however, an optimal extreme point solution is required.
This optimal extreme point solution and the corresponding basis provide a powerful
representation which arises in LP and its duality theory. Consider the LP problem of (2.1)
and assume that the constraint matrix $A$ is of full rank. For our purposes it is sufficient to
state that the optimal basis of this LP problem is a submatrix $B$, $B \in \mathbb{R}^{m \times m}$, such that
$(x^*)^T = [(B^{-1} b)^T, 0]$, $x^* \in \mathbb{R}^n$, is an optimal solution of the primal problem and $y^* = (B^T)^{-1} c_B$ is the
optimal solution for the dual problem. A primal (non degenerate) basic solution requires
exactly $m$ primal variables to be active and their corresponding dual slack variables to be 0.
Compared to the optimal SSX solution for a given LP, the IPM solution has more active
variables if the problem is dual degenerate or fewer if it is primal degenerate. In the
integration of IPM and SSX a basis recovery procedure has to be constructed which applies
to both these cases.
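
The relationship between a basis and the primal and dual solutions it defines can be made concrete on a small example. The Python sketch below is an illustration under stated assumptions: the LP data, the function names and the use of exact rational arithmetic are all made up for this example. It computes $x_B = B^{-1} b$, $y = (B^T)^{-1} c_B$ and the dual slacks $z = c - A^T y$ for a given list of basic columns.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve the square system M u = rhs by Gauss-Jordan elimination
    with exact fractions (M is assumed to be non singular)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def basic_solution(A, b, c, basis):
    """Return (x, y, z) for the basis given as a list of m column indices."""
    m, n = len(A), len(A[0])
    B = [[A[i][j] for j in basis] for i in range(m)]
    x = [Fraction(0)] * n
    for j, value in zip(basis, solve(B, b)):          # x_B = B^{-1} b
        x[j] = value
    B_transposed = [[B[i][k] for i in range(m)] for k in range(m)]
    y = solve(B_transposed, [c[j] for j in basis])    # B^T y = c_B
    z = [Fraction(c[j]) - sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
    return x, y, z


# Made-up LP: min -x1 - 2 x2  s.t.  x1 + x2 + s1 = 4,  x1 + 3 x2 + s2 = 6,  all >= 0
A = [[1, 1, 1, 0], [1, 3, 0, 1]]
b = [4, 6]
c = [-1, -2, 0, 0]
x, y, z = basic_solution(A, b, c, basis=[0, 1])
print(x, y, z)
```

For this basis both $x \ge 0$ and $z \ge 0$ hold, so the basis is optimal; exactly $m = 2$ primal variables are active and their dual slacks are 0.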

There  are  essentially  three  related  problems  which  lie  at  the  heart  of  the  basis  recovery 
procedure  which  may  be  stated  as: 

(i) given a primal feasible interior point solution $\bar{x}$, find a superior basic feasible solution
$x_B$, such that $c_B^T x_B \le c^T \bar{x}$,

(ii) given a set of primal or dual optimal solution values, find $B$ such that:

$x_B^* = B^{-1} b$ and $A^T y^* \le c$

(iii) given a set of primal and dual optimal solution values, find $B$ as in (ii).

We note that if IPM stops at a primal feasible but suboptimal interior point we have to apply
procedure (i); on the other hand, if the IPM terminates near enough to the optimal solution
then solution procedures for (ii) or (iii) may be applied.


Basis  recovery  using  a  primal  quantitative  approach 

The  recovery  of  a  superior  or  an  optimum  basic  solution  from  a  primal  feasible  non-extreme 
point  solution  is  well  understood  in  the  context  of  the  SSX  algorithm.  For  instance  the 
BASIC  procedure  within  MPSX  finds  a  superior  extreme  point  solution  and  the 
corresponding  basis.  Mitra  and  Tamiz  (88),  Marsten  (89),  and  others  set  out  comparable 
pivotal  schemes  for  a  primal  only  approach  to  this  problem.  The  steps  of  this  algorithm  are 
summarized  below: 

Let $x^*$ be the primal feasible (or optimal) non-extreme point solution found by the interior
point method. We partition the $A$ matrix into three parts $[B, N, S]$ and the corresponding
variable or column indices into three sets $[I_B, I_N, I_S]$. These correspond to basic variables, nonbasic
variables at their bounds and superbasic variables. Let
$x^* = [x_B^*, x_N^*, x_S^*]^T$; then we can express the original LP system of equations as:

$Ax^* = b:\; B x_B^* + N x_N^* + S x_S^* = b$  (3.1)

We also set

if $x_j < \epsilon$ then $x_j = 0$  (3.2)

and re-express the system (3.1) as:

$x_B^* = B^{-1}(b - N x_N^*) - B^{-1} S x_S^*$  (3.3)

Note that since no upper bound is defined, $N x_N^* = 0$.
To this equation representation of the problem we apply a "reverse simplex" algorithm in
which the number of variables which are nonzero is reduced from $|I_B| + |I_S|$ to $|I_B|$ in $|I_S|$
pivotal steps and the corresponding $x_S$ variables are moved to their bound depending on their
reduced cost coefficients; for a detailed explanation of this procedure see (Mitra 88). This
method was also implemented within the first versions of OB1 (Marsten 89-2) and OSL
(Forrest 90). The experimental results, however, were rather disappointing. On average the
basis recovery steps took 15-30% of IPM time and sometimes exceeded the IPM time.
The difficulty experienced in the basis recovery (SSX) steps can be ascribed to the cases
where the optimal solution is primal or dual degenerate.

Recently, Bixby and Saltzman (92) analyzed the above approach and suggested several
extensions. They attribute the slow convergence of the primal cross-over algorithm to one of
the two following reasons:

(i) Let $(x_B^*, x_S^*, x_N^*)$ be the partition of the IPM solution into basic, superbasic and nonbasic
variables. After the initial basis is constructed, a basic solution $x_B = B^{-1}(b - N x_N - S x_S) = B^{-1} b$
is computed. There is no guarantee that this solution is numerically stable and thus the
residual $b - B x_B$ can be unacceptably large. Even if this residual is small, the computed $x_B$
can be a bad approximation to the original $x_B^*$.

(ii) The variables that are superbasic but were not included in the basic solution are fixed to
their bounds. The computed $x_B$ may be highly sensitive to these perturbations.

To overcome these difficulties, Bixby and Saltzman propose the following procedures:

(i) Construction of the initial basis

The variables $x_j$ such that [...] are sorted in decreasing order; a variable is considered to
be 0 if $x_j < 10^{[...]}$. The initial basis is constructed from the first $m$ variables in the list; the rest
of the variables that are not set to zero are considered as superbasic.

(ii) Singularity tolerance

The LP is scaled such that $\max_{ij} |a_{ij}| \le 1$; the standard singularity tolerance $\tau = 10^{[...]}$, which
gives the minimum of an acceptable pivot entry in the SSX algorithm, is set to
$\tau = 10^{[...]}$. Columns with no acceptable pivot value are rejected. If this or the original
selection results in an incomplete basis then nonbasic slack variables that correspond to
uncovered rows are included in the basis.

(iii) Feasibility check

After the initial basis is constructed using steps (i) and (ii), $x_B$ is computed. If the scaled sum
of infeasibilities is more than 1.0, the basis is rejected and step (ii) is repeated with $\tau = 0.1$.

The  basis  constructed  by  this  heuristic  procedure  is  used  in  an  algorithm  similar  to  the  one 
used  by  Marsten  et  al.  and  Mitra  et  al. 
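
Step (i) of the procedure above, the ordering of the variables and their split into basic candidates, superbasic variables and variables set to zero, might be sketched in Python as follows. This is only an illustration: the function name is hypothetical, the zero tolerance `zero_tol` is a placeholder value chosen for the example (the tolerance used by Bixby and Saltzman is not legible in this text), and the singularity test of step (ii) and the feasibility check of step (iii) are not included.

```python
def partition_by_value(x, m, zero_tol):
    """Split the column indices of an interior solution x into
    (basic candidates, superbasic, set to zero).

    x        : list of non negative variable values
    m        : number of rows, i.e. the size of a basis
    zero_tol : assumed threshold below which a variable is treated as 0
    """
    # Indices sorted by decreasing variable value.
    order = sorted(range(len(x)), key=lambda j: x[j], reverse=True)
    nonzero = [j for j in order if x[j] >= zero_tol]
    at_zero = sorted(j for j in order if x[j] < zero_tol)
    # The m largest variables are basis candidates, the rest are superbasic.
    return nonzero[:m], nonzero[m:], at_zero


# Made-up interior solution with five variables and m = 2 rows.
x_ipm = [0.5, 1e-12, 3.0, 0.2, 2.0]
basic, superbasic, at_zero = partition_by_value(x_ipm, m=2, zero_tol=1e-8)
print(basic, superbasic, at_zero)
```

The candidate list returned here would still have to pass the pivot tolerance test before being accepted as a basis.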

The results reported by Bixby et al. indicate that their heuristic improves the primal basis
recovery algorithm dramatically. On average, the reported cross-over times are 5% of the
total solution time.

Basis  recovery  using  a  primal  dual  quantitative  approach 

Megiddo (91) proved that, from a theoretical point of view, a cross-over algorithm which uses
both primal and dual information is preferable. This idea is encapsulated in the following
theorems:

Theorem 3.1: If there exists a strongly polynomial time algorithm that finds an optimal
basis, given an optimal solution for either the primal or the dual, then there exists a strongly
polynomial algorithm for the general LP problem.

Theorem 3.2: There exists a strongly polynomial time algorithm that finds an optimal basis,
given optimal solutions for both the primal and the dual.

Megiddo gives a constructive proof of Theorem 3.2 and shows that there is no known
strongly polynomial algorithm to retrieve the optimal basis if we take into consideration only
the primal or the dual optimal interior solution. This explains the importance of problem
(iii) stated above. Megiddo's procedure, which takes into account both primal and dual
optimal solution values, is given in algorithm 3.1.

Algorithm  3.1:  recovering  an  optimal  primal  dual  basis 

Let $(x^*, y^*, z^*)$ be the IPM optimal solution of the problem stated in (2.1) and assume that
$A$ is of full rank (the rows of the matrix $A$ are linearly independent). Our aim is to construct
an optimal basic solution from the optimal interior point solution.

Let $A_{x^*}$ be the part of the constraint matrix which corresponds to the variables that are
active at the solution: $A_{x^*} = \{ a_j : a_j \text{ is a column of } A \text{ and } x_j^* > 0 \}$. Similarly, let $A_{z^*}$ be the
part of the constraint matrix consisting of columns of $A$ which correspond to dual slacks that
are 0 in the solution: $A_{z^*} = \{ a_j : a_j \text{ is a column of } A \text{ and } z_j^* = 0 \}$ (these sets are identical if
the problem is non degenerate).

From duality theory and the complementary slackness conditions:
$A_{x^*} \subseteq A_{z^*}$,  $(y^*)^T A_{z^*} = c_{z^*}^T$.

1. Construct a minimal size primal solution

Reduce the size of the submatrix $A_{x^*}$ to create a linearly independent matrix by repeating
the following procedure until no reduction is possible:

If $A_{x^*}$ is linearly dependent then there exists a vector $\eta \ne 0$ such that $A_{x^*} \eta = 0 \Rightarrow c_{x^*}^T \eta = 0$;
thus for some scalar $t$ we construct the vector $x'$ defined by the relation:

$x'_j := x_j^* + t \eta_j$ if $x_j^* > 0$, and $x'_j := 0$ otherwise, $\forall j = 1,..,n$,
with $t$ such that $x' \ge 0$ and $\exists j: x_j^* + t \eta_j = 0$.

The vector $x'$ is still an optimal solution but with a smaller set of positive indices:
$\sigma(x') \subset \sigma(x^*)$.

The result of this procedure is the submatrix $A_{x'} = \{ a_j : x'_j > 0 \}$ whose columns are all linearly
independent.

If $|A_{x'}| = m$ then set $B = A_{x'}$, GOTO step 4

2. Add variables according to the dual problem:

While (there is a column in $A_{z^*}$ which is independent of the columns of $A_{x'}$) do
add the column to $A_{x'}$
(the linear independence of the column can be easily checked by using Gaussian
elimination)
end while

If $|A_{x'}| = m$ then set $B = A_{x'}$, GOTO step 4


3. Add more variables by dual range check

Since $A$ is of full rank and $|A_{x'}| < m$ then there must exist at least one column which is
independent of the columns of $A_{x'}$. Let $a_j$ be such a column; then $a_j$ cannot be in the
original $A_{x^*}$ or $A_{z^*}$, hence $a_j^T y^* < c_j$. For such a column $a_j$ we solve the following
system of equations:

[...]

A solution for such a system exists because the system is linearly independent.

Since every column of $A_{z^*}$ is now a linear combination of columns of $A_{x'}$, then for every
scalar $t$ we have: $(y^* - t\,d)^T A_{z^*} = c_{z^*}^T$, where $d$ is the solution of the system above.

We fix:

$t_0 = \min \{\, [...] \,\}$

The submatrix $A_{z'}$ contains at least one column which is linearly independent of the columns
of $A_{x'}$. These columns are added to $A_{x'}$ one at a time in the same way as in step 2.

This process is repeated until $|A_{x'}| = m$, then set $B = A_{x'}$

4. $B$ is the desired primal and dual basis

Throughout the process described in algorithm 3.1 the primal and dual optimality conditions
of the intermediate solutions $x', y', z'$ are maintained such that $\forall a_j \notin A_{x'}: x'_j = 0$. Variants
of Algorithm 3.1 were implemented by Forrest (91) in the IBM OSL library and by Lustig
(91) in OB1. These implementations proved that, from the practical point of view, this
primal dual procedure is more efficient than the primal only basis recovery procedure.
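
Step 1 of Algorithm 3.1 can be sketched in Python as follows. This is an illustration only, not the implementation used in OSL or by Megiddo: the function names are hypothetical, exact rational arithmetic (`fractions.Fraction`) stands in for the floating point linear algebra of a real solver, and the code assumes that the input `x` is feasible and optimal without checking it. The step repeatedly finds a null vector $\eta$ of the active columns and moves along it until one more variable reaches zero.

```python
from fractions import Fraction


def null_vector(cols):
    """Return a nonzero eta with sum_k eta[k] * cols[k] = 0,
    or None if the given columns are linearly independent."""
    k, m = len(cols), len(cols[0])
    M = [[Fraction(cols[j][i]) for j in range(k)] for i in range(m)]
    pivots = []                      # pivots[r] = pivot column of reduced row r
    row = 0
    for col in range(k):
        p = next((r for r in range(row, m) if M[r][col] != 0), None)
        if p is None:
            # Column `col` depends on the pivot columns found so far.
            eta = [Fraction(0)] * k
            eta[col] = Fraction(1)
            for r, pc in enumerate(pivots):
                eta[pc] = -M[r][col]
            return eta
        M[row], M[p] = M[p], M[row]
        piv = M[row][col]
        M[row] = [v / piv for v in M[row]]
        for r in range(m):
            if r != row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
    return None


def reduce_support(A, x):
    """Shrink the set of positive entries of x until the corresponding
    columns of A are linearly independent; A x is left unchanged."""
    m, n = len(A), len(A[0])
    x = [Fraction(v) for v in x]
    while True:
        support = [j for j in range(n) if x[j] > 0]
        if not support:
            return x
        eta = null_vector([[A[i][j] for i in range(m)] for j in support])
        if eta is None:
            return x
        if all(e >= 0 for e in eta):
            eta = [-e for e in eta]  # make sure some component decreases
        # Largest step that keeps x non negative; it zeroes at least one entry.
        t = min(x[j] / -e for j, e in zip(support, eta) if e < 0)
        for j, e in zip(support, eta):
            x[j] += t * e


# Made-up example: min x1 + x2  s.t.  x1 + x2 = 2,  x3 = 1,  x >= 0.
# Every feasible point is optimal; (1, 1, 1) is interior to the optimal face.
A = [[1, 1, 0], [0, 0, 1]]
x_reduced = reduce_support(A, [1, 1, 1])
print(x_reduced)
```

In the example the first two columns are parallel, so one of the first two variables is driven to zero while $Ax$ and the objective value stay the same.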

Basis  recovery  using  a  primal  qualitative  approach 

We now describe a qualitative method that we have developed for the crossover procedure.
This procedure was especially developed to utilize the prediction of the optimal solution using
active and dormant sets as described in the previous section. Instead of predicting the
optimal solution and performing another IPM iteration to verify it, we use the prediction
inside the SSX algorithm.

Given the original LP problem $P$ we create a related problem $\bar{P}$ in the following way. In
addition to the column index sets $I_B$, $I_N$, $I_S$ defined in the primal retrieval procedure we define two
row index sets corresponding to zero and positive logicals respectively. The relationship of
the indicator sets to these sets is given as $\sigma^k(x) = I_B \cup I_S$, [...].

We construct the related problem $\bar{P}$ as shown in Figure 3.1 by fixing variables in the set $I_N$
to their respective upper or lower bounds and making the rows with positive logicals free rows. We
note that if the IPM prediction is correct then the given optimum solution to $P$ is a feasible
solution to $\bar{P}$. The basis recovery procedure is stated in algorithm 3.2:
Figure 3.1: construction of the restricted problem

Algorithm 3.2

1. Create a starting basis for $\bar{P}$ using the simplex CRASH procedure (Forrest 90, NAG
91).

2. Solve $\bar{P}$ by an SSX algorithm, save the basis $B^{-1}(\bar{P})$.

3. Reinstate the original problem $P$,
start with $B^{-1}(\bar{P})$ and apply SSX to optimality.

If the prediction of the optimal basis is slightly wrong (as can happen if IPM is terminated
before an optimal solution is reached) then step 2 is terminated with a 'no feasible solution'
status. The resulting basis, however, is usually near optimal and thus requires a low
number of iterations in step 3 to reach optimality.
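
The construction of the restricted problem can be sketched as a simple transformation of the bound and row data. The Python fragment below is a hypothetical illustration: the function name, the argument layout and the use of MPS-style row type letters ('E', 'L', 'G', and 'N' for a free row) are assumptions made for the example, and the SSX solves of steps 1 to 3 are not shown.

```python
def build_restricted_problem(col_bounds, row_types, dormant_at, free_rows):
    """Return (bounds, rows) describing the restricted problem.

    col_bounds : list of (lower, upper) pairs, one per column
    row_types  : list of row type letters, one per row
    dormant_at : dict {column index: 'L' or 'U'}, the bound at which a
                 column predicted as dormant is fixed
    free_rows  : row indices predicted to have a positive logical
    """
    bounds = list(col_bounds)
    for j, side in dormant_at.items():
        lower, upper = col_bounds[j]
        value = lower if side == 'L' else upper
        bounds[j] = (value, value)       # fixed variable: lower == upper
    rows = list(row_types)
    for i in free_rows:
        rows[i] = 'N'                    # free row: constraint not enforced
    return bounds, rows


# Made-up data: three columns, three rows.
col_bounds = [(0.0, 10.0), (0.0, 5.0), (0.0, float("inf"))]
row_types = ['E', 'L', 'G']
bounds, rows = build_restricted_problem(
    col_bounds, row_types, dormant_at={1: 'U', 2: 'L'}, free_rows=[1])
print(bounds, rows)
```

Reinstating the original problem in step 3 then amounts to restoring `col_bounds` and `row_types`, which this sketch leaves unmodified.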

4.  Results  and  analysis 

Figures 4.1, 4.2 and 4.3 illustrate the behaviour of our primal-dual indicators. Figures 4.1
and 4.2 show the behaviour of primal and dual Tapia's indicators for two dormant variables
(var 3 and var 4) and two active ones (var 1 and var 2) on the problem Stair. As expected,
in Figure 4.1 the indicators which represent the active variables converge to 1 while those
which represent the dormant ones converge to 0. Figure 4.2 is the dual mirror image of
Figure 4.1. In it, the dual slacks of the active primal variables converge to 0 while the dual
slack variables of the dormant primal variables demonstrate consistent growth. The convergence is
monotonic only in the last stages of the algorithm and for some problems a sharp separation
between the sets is not always easily derived (especially if some upper bound variables are
also present). In comparison, the behaviour of the indicators in Figure 4.3 that are based
on equation (2.10) shows an earlier as well as stronger separation. Our observations show
that soon after feasibility is reached, many of the active primal variables show a rapid growth
for several iterations. This growth is closely linked to the rapid arrival of IPM at a
near-optimal solution; then, the growth of the primal variables is followed by a similar reduction
in the value of the dual slack variables and the indicator shows an exponential growth from
iteration to iteration. This growth is reduced when the solution reaches optimality.

The behaviour of the primal, dual and our primal dual indicators is tested on four variables
of the problem Stair as illustrated in Figures 4.1, 4.2 and 4.3.


[Figure: Problem Stair, behaviour of four primal variables (var 1 to var 4)]

Figure 4.1 Primal indicators in the problem Stair

[Figure: Problem Stair, behaviour of four dual variables, plotted against iterations]

Figure 4.2 Dual indicators in the problem Stair

[Figure: Problem Stair, behaviour of primal-dual indicators (var 1 to var 4), plotted against iterations]

Figure 4.3 Primal-dual indicators in the problem Stair

In our identification heuristic, we use a combination of the primal dual indicators and those
of (2.10) to identify the active and dormant sets. A special array predict(ncol) holds an
integer value that is proportional to growth in the indicator value. If the value is increased
by a sufficient amount for more than one iteration, the corresponding variable is marked as
active. If the value is decreased by the same amount, the corresponding variable is marked
as dormant.
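
The bookkeeping described above might look as follows in Python. This is a simplified sketch under stated assumptions, not the authors' code: the growth factor `growth`, the number of required iterations `hits`, the function names and the sample indicator values are all hypothetical, and the counter is a plain tally of consecutive increases or decreases rather than a value proportional to the growth.

```python
def update_prediction(predict, previous, current, growth=10.0):
    """Update the integer array `predict` in place from two consecutive
    vectors of (positive) indicator values."""
    for j, (old, new) in enumerate(zip(previous, current)):
        if new >= growth * old:
            predict[j] += 1          # indicator grew by the assumed factor
        elif new * growth <= old:
            predict[j] -= 1          # indicator shrank by the same factor
    return predict


def classify(predict, hits=2):
    """Split column indices into (active, dormant, undecided)."""
    active = [j for j, v in enumerate(predict) if v >= hits]
    dormant = [j for j, v in enumerate(predict) if v <= -hits]
    undecided = [j for j, v in enumerate(predict) if -hits < v < hits]
    return active, dormant, undecided


# Made-up indicator values for three variables over three iterations.
history = [
    [1.0, 1.0, 1.0],
    [20.0, 0.05, 2.0],
    [500.0, 0.001, 1.5],
]
predict = [0, 0, 0]
for previous, current in zip(history, history[1:]):
    update_prediction(predict, previous, current)
active, dormant, undecided = classify(predict)
print(predict, active, dormant, undecided)
```

With these values the first variable is marked active, the second dormant, and the third stays undecided because its indicator never changes by the required factor.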
In Table 4.1 and Table 4.2 we present the results of our identification heuristic in
intermediate stages of the IPM algorithm for two NETLIB problems. The tables show
the number of active and dormant variables which are identified by the heuristic (the column
marked Fore.) and compare them to those found at optimality. The number of hits and misses
in every iteration is also given.

The tables indicate that most dormant variables and some active variables can be recognized
almost as soon as feasibility is reached. This property of the algorithm is later used for
basis recovery and for reducing the model size dynamically by removing these dormant
variables.


In Table 4.3 we demonstrate the qualitative approach on the problem OR3 (algorithm 3.2).
Apart from the SSX only run (first row of the table) and cross over from optimality (last row
of the table) the solution set is determined by the indicator heuristic.

The "IPM Iterations" column gives the iteration number at which cross over was made. The
column marked "Basis recovery Pass I" gives the number of SSX iterations on the restricted
problem; the column marked "Basis recovery Pass II" gives the number of SSX iterations
needed to prove the optimality of the solution for the complete problem. We note that the
predictions of the 14 and 16 iterations did not find the exact optimal set but, as expected,
they were close enough to the primal basis to cause a considerable reduction in the number
of SSX iterations. In Table 4.4 we give the qualitative basis recovery results for some more
NETLIB problems.


Model   IPM          IPM Time   Basis recovery
        Iterations   (sec)      Pass I    Pass II

OR3      0            0           -        1765
OR3     14           10.5       200'       1480
OR3     16           12          40'        680
OR3     18           13.5       100         340
OR3     20           15         180          71
OR3     22           16.5       240           6
OR3     24           18         220          13

' optimal solution not found in the restricted problem

Basis recovery Time / Total Time (only three entries are legible): 354 / 354;  365 / 365;  216 / 227

Table 4.3 qualitative crossover results
Model     Iteration   Variable   IPM Time   Pass I   Pass II   Recovery Time   Total

Ganges    26          1223        87         336      379       33.8           120.80
[...]     [...]       1223       101.1       405      187       28.6           129.70
25fv47    39           773       214.61      267      206       25             239.61
25fv47    44           770       242         331      124       23.8           265.80
Ship12l   19           736        29          39      159       23              52.00
Ship12l   26           726        39.78       38       67       18.5            58.28
Cre-a     45           977       185          80     2423      224             409.00

Table 4.4 Basis recovery results
Abstract

In this paper a new algorithm is proposed, based upon the idea of
modeling the objective function of a global optimization problem as a
sample path from a Wiener process. Unlike previous work in this field,
in the proposed model the parameter of the Wiener process is considered
as a random variable whose conditional (posterior) distribution function
is updated on-line. Unfortunately, using a natural conjugate prior
distribution on such a parameter, consistency of the Bayesian algorithm is
lost. The authors propose a modified prior distribution which overcomes
this difficulty. Furthermore, stopping criteria for Bayesian algorithms are
discussed.

Introduction 

Let us consider the global optimization problem defined as the problem of finding
$f^*$ in such a way that

$f^* = \max_{x \in K} f(x)$  (1)

where $K \subset \mathbb{R}^n$ is a compact set and $f$ is a continuous, real-valued function
defined over $K$. Many algorithms have been proposed in the literature for dealing
with the rather frequent situation in which $f$ possesses many local optima. For
a general survey the interested reader is referred to (Torn & Zilinskas, 1989).
Among the algorithms which deserve special attention are, in the authors'
opinion, those based on stochastic elements: research in this area may be roughly
categorized in two main streams, namely research based on the idea of modeling
the objective function $f$ as a sample path of a stochastic process and research
based on the introduction of a random element in the algorithm itself.
Algorithms belonging to the former stream have been recently surveyed in (Betro,
1991), while those belonging to the latter are surveyed in (Schoen, 1991). A
very good account of classical as well as recent results in both areas can be
found in (Zhigljavsky, 1991).

In this paper attention will be restricted to the class of objective functions
defined over a closed interval of the real axis. Such a restriction is, from the
theoretical point of view, admissible in view of the fact that multidimensional
global optimization problems can be transformed into one-dimensional ones, e.g.
by means of Peano mappings (Strongin, 1992). The authors are obviously aware
of the computational difficulties inherent in such a transformation; however, the
analysis of one-dimensional global optimization problems is already sufficiently
complex to be worthwhile and, in the authors' opinion, it can shed light on
the challenging problem (1).

One of the best-known and best-performing methods for one-dimensional
global optimization, known as the Bayesian approach, was introduced in (Mockus,
1975) and extensively reviewed in (Mockus, 1989).

The main idea of the Bayesian approach is that of considering the objective
function $f$ as a sample path of a stochastic process. Following the traditional
scheme of Bayesian analysis, a loss function is introduced and the “decision” to
be taken at each iteration of the algorithm, namely the choice of a point $x \in K$
where to evaluate $f$, is made according to the minimization of the expected loss
or, in Bayesian terminology, risk.

1  Sequential  Bayesian  optimization 

Let the objective function $f$ be considered as a sample path of a stochastic
process $F(x;\omega)$; i.e., we assume that there exists an $\omega \in \Omega$ such
that $f(x) = F(x;\omega)$ $\forall x$; a suitable probability space over the sample
space $\Omega$ is assumed to be defined. The idea of Bayesian algorithms for problem
(1) is that the “decision-maker” should choose an “action” $a$, which, in the present
context, corresponds to choosing a point where to evaluate $f$, trying to minimize
the expected value of a suitably defined “loss function”. The reader is referred, for
example, to (Berger, 1985) for a detailed account of Bayesian analysis and
terminology. Let us assume that $n$ observations of $f$ have already been performed
at the points $x_1 < \cdots < x_n$ and let $f_n^* = \max_{i=1,\ldots,n} f(x_i)$.
A natural form of the loss function for problem (1) is given by

$$\mathcal{L}(\omega, a) = \max_y F(y;\omega) - \max\{F(a;\omega), f_n^*\}. \tag{2}$$

The above loss function measures the loss incurred when the action of
evaluating the objective function at $a \in K$ is taken and the objective
function $f(\cdot)$ is $F(\cdot;\omega)$; it is assumed that $f$ has already
been evaluated at $n$ points.

In most of the published papers on this subject it is assumed that a fixed
number of observations, $N > 0$, can be performed. However, it seems more
sensible to assume that the disadvantage of having to evaluate the (presumably)
expensive function $f$ is directly included in the definition of the loss function.
We thus propose the use of the following loss:

$$\mathcal{L}(\omega, a, n) = \max_y F(y;\omega) - \max\{F(a;\omega), f_n^*\} + (n+1)c \tag{3}$$

where $c > 0$ measures the cost of each function evaluation; here, as before, action
$a \in K$ is chosen: having already evaluated $f$ at $n$ points, a further observation
at $a$ gives a total of $n+1$ observations, each at cost $c$.

Let

$$z_k = \bigl((x_1, f(x_1)), \ldots, (x_k, f(x_k))\bigr)$$

be the information available after the $k$-th observation, with $z_0 = \emptyset$.
A sequential decision is thus defined as a function

$$d = (\tau, \delta)$$

where

$$\tau = \tau_0, \tau_1(z_1), \ldots, \tau_k(z_k), \ldots$$

$$\delta = \delta_0, \delta_1(z_1), \ldots, \delta_k(z_k), \ldots$$

Functions $\tau_k$ map the available information at step $k$ into one of the actions
“stop” and “don’t stop”; functions $\delta_k$ give the decision on where to place the
next observation point.

Although theoretically it is possible to define an optimal sequential decision
rule, it is practically impossible to identify a closed-form expression or even
a computationally manageable optimal rule. It is thus very common practice
to implement a so-called rolling-horizon or $k$-step look-ahead rule. Given the
practical difficulty of even these decision strategies, one is usually left with
1-step look-ahead rules, i.e. rules which are optimal in the subset of
decision rules which prescribe stopping not later than the next observation.
The “rolling-horizon” feature of this method arises because, in the implementation,
if the rule calls for stopping then the algorithm terminates, whereas if the rule calls
for one more observation, that observation is taken and a new 1-step look-ahead
rule is implemented.

In the present context the Bayes risk of a decision $(\tau_n, \delta_n)$ is given by

$$r(\tau_n, \delta_n) = \begin{cases}
E\bigl(\max_y F(y;\omega) - f_n^* + nc \mid z_n\bigr) & \text{if } \tau_n = \text{“stop”} \\
E\bigl(\max_y F(y;\omega) - \max\{F(\delta_n;\omega), f_n^*\} + (n+1)c \mid z_n\bigr) & \text{otherwise}
\end{cases} \tag{4}$$

where $E$ stands for expectation.

A one-step look-ahead rule has the form:

$$\delta_n^* = \arg\min_{x \in K} E\bigl(\max_y F(y;\omega) - \max\{F(x;\omega), f_n^*\} + (n+1)c \mid z_n\bigr) \tag{5}$$

and $\tau_n^* =$ “stop” if and only if the current loss

$$E\bigl(\max_y F(y;\omega) - f_n^* + nc \mid z_n\bigr)$$

is less than or equal to the predicted one:

$$E\bigl(\max_y F(y;\omega) - \max\{F(\delta_n^*;\omega), f_n^*\} + (n+1)c \mid z_n\bigr) \tag{6}$$

It is easy to see that the previous optimal decision problem decomposes into
the problem of optimally placing the next observation, which is equivalent to
finding

$$\delta_n^* = \arg\max_{x \in K} E\bigl(\max\{0, F(x;\omega) - f_n^*\} \mid z_n\bigr) \tag{7}$$

and then deciding to stop the computation (and not evaluate $f$ at
$x_{n+1} = \delta_n^*$) if and only if

$$E\bigl(\max\{0, F(\delta_n^*;\omega) - f_n^*\} \mid z_n\bigr) \le c \tag{8}$$
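
The step from (5) and (6) to (7) and (8) can be made explicit. What follows is only a sketch of the argument, written with the symbols of this section; it uses the facts that $f_n^*$ and $nc$ are known once $z_n$ is given, and that $\max\{F, f_n^*\} = f_n^* + \max\{0, F - f_n^*\}$.

```latex
% Placement: terms that do not depend on x can be dropped from the arg min.
\begin{align*}
\arg\min_{x \in K} E\bigl(\max_y F(y;\omega) - \max\{F(x;\omega), f_n^*\} + (n+1)c \mid z_n\bigr)
  &= \arg\max_{x \in K} E\bigl(\max\{F(x;\omega), f_n^*\} \mid z_n\bigr) \\
  &= \arg\max_{x \in K} E\bigl(\max\{0, F(x;\omega) - f_n^*\} \mid z_n\bigr).
\end{align*}
% Stopping: subtract E(max_y F(y;omega) | z_n) + nc from both losses.
\begin{align*}
E\bigl(\max_y F(y;\omega) - f_n^* + nc \mid z_n\bigr)
  &\le E\bigl(\max_y F(y;\omega) - \max\{F(\delta_n^*;\omega), f_n^*\} + (n+1)c \mid z_n\bigr) \\
\iff \quad -f_n^*
  &\le -f_n^* - E\bigl(\max\{0, F(\delta_n^*;\omega) - f_n^*\} \mid z_n\bigr) + c \\
\iff \quad E\bigl(\max\{0, F(\delta_n^*;\omega) - f_n^*\} \mid z_n\bigr)
  &\le c.
\end{align*}
```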

Stopping rule (8) gives further insight into the meaning of $c$, which can be
considered as a threshold for the gain which is expected from performing one more
observation: if this expected improvement falls below $c$, it is considered not
worthwhile to continue sampling.

In order to specify a practical algorithm an appropriate probability measure
has to be defined; this is the subject of the next section.


2  Wiener  process  with  unknown  parameter 

A standard probabilistic model for the objective function $f$ when $K$ is an interval
on the real line is the Wiener process. In the following we shall assume that
$K = [0, 1]$. Although this process possesses an undesirable feature, namely almost
everywhere non-differentiability, computational tractability makes it almost the
only feasible choice. We say that a stochastic process $F(x;\omega)$ is a Wiener
process if

•  $F(0;\omega) = f_0$, a constant;

•  $F(x;\cdot)$ is a random variable with distribution $N(f_0, \sigma^2 x)$, where
$\sigma$ is a parameter of the process;

•  $F(x;\cdot)$ has independent, stationary increments.

In the literature the parameter $\sigma$ is assumed to be known or, in some cases,
estimated by means of an initial sample from $f$. In this paper the Bayesian
paradigm will be assumed and $\sigma$ will be considered as a random variable which
has to be estimated on-line.

As a consequence of the normality assumption for $F(x;\cdot)$, it is only
natural to assume as a prior distribution on $\sigma$ a “natural conjugate prior”:
definitions and examples of conjugate classes of distributions can be found in any
text on Bayesian statistics. As the observations of a Wiener process are Gaussian
random variables whose variance is proportional to the unknown parameter $\sigma^2$,
a natural choice for the prior distribution comes from the observation that, in
an independent normal process $\{X_i\}$ with known mean $\mu$ and variance
$\sigma^2$, the random variable

$$\frac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2$$

is distributed as a $\chi^2$ random variable with $n$ degrees of freedom. Inverting the
distribution, it is thus possible to define a family of prior probability distribution
functions for $\sigma^2$ which depend on two parameters, $\alpha_0$, representing the
initial degrees of freedom, and $S_0$ (the particular distribution goes under the name
of inverted-gamma distribution), given by

$$p(\sigma^2; \alpha_0, S_0) = \mathcal{I}(\sigma^2; \alpha_0, S_0) \propto \exp\bigl(-S_0/(2\sigma^2)\bigr)\,(\sigma^2)^{-(\alpha_0/2+1)}. \tag{9}$$

It is then possible to show (the detailed proofs of all the original results
in this paper will appear elsewhere) that, given a sample $z_n$ from the Wiener
process $F$, the posterior distribution of $\sigma^2$ is given by

$$p(\sigma^2 \mid z_n; \alpha_0, S_0) = \mathcal{I}(\sigma^2; \alpha_n, S_n) \tag{10}$$

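where

$$\alpha_n = \alpha_{n-1} + 1, \qquad
S_n = S_{n-1} + \frac{\bigl(f(x_n) - E(F(x_n;\omega) \mid z_{n-1})\bigr)^2}{\Delta(x_n \mid z_{n-1})},$$

$$\Delta(x \mid z_{n-1}) = \begin{cases}
x_1 - x & \text{if } x < x_1 \\
\dfrac{(x - x_{i-1})(x_i - x)}{x_i - x_{i-1}} & \text{if } x_{i-1} < x < x_i \\
x - x_{n-1} & \text{if } x > x_{n-1}
\end{cases}$$

$$E(F(x;\omega) \mid z_{n-1}) = \begin{cases}
f_1 & \text{if } x < x_1 \\
f_{i-1} + \dfrac{x - x_{i-1}}{x_i - x_{i-1}}\,(f_i - f_{i-1}) & \text{if } x_{i-1} < x < x_i \\
f_{n-1} & \text{if } x > x_{n-1}
\end{cases}$$

with $f_i = f(x_i)$; here $\Delta(x \mid z_{n-1})$ denotes the conditional variance of
$F(x;\omega)$ given $z_{n-1}$, divided by $\sigma^2$.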

Thus the posterior distribution of the parameter of the Wiener process can
be updated sequentially by means of the above formulae.
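
A sequential update of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the conditional mean and variance are the standard ones for a Wiener process (linear interpolation between neighbouring observations, constant extrapolation outside them), and the new point is assumed to differ from all previous ones.

```python
from bisect import bisect_left


def conditional_mean_and_scale(x, xs, fs):
    """Conditional mean of F(x) and its conditional variance divided by sigma^2,
    given the values fs observed at the sorted points xs (assumed formulas)."""
    i = bisect_left(xs, x)
    if i == 0:                       # x lies to the left of every observation
        return fs[0], xs[0] - x
    if i == len(xs):                 # x lies to the right of every observation
        return fs[-1], x - xs[-1]
    x_lo, x_hi = xs[i - 1], xs[i]    # x lies between two neighbouring observations
    t = (x - x_lo) / (x_hi - x_lo)
    mean = fs[i - 1] + t * (fs[i] - fs[i - 1])
    scale = (x - x_lo) * (x_hi - x) / (x_hi - x_lo)
    return mean, scale


def update_posterior(alpha, S, x_new, f_new, xs, fs):
    """Return (alpha_n, S_n) after observing f_new at x_new.

    xs and fs are kept sorted and are updated in place."""
    mean, scale = conditional_mean_and_scale(x_new, xs, fs)
    i = bisect_left(xs, x_new)
    xs.insert(i, x_new)
    fs.insert(i, f_new)
    return alpha + 1, S + (f_new - mean) ** 2 / scale
```

In this sketch each new observation adds one degree of freedom and one squared, standardized prediction error to the scale parameter, so the cost of an update is dominated by locating the new point among the old ones.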
3  One-step  look-ahead  algorithm

The choice of the next observation point is thus reduced to the maximization
of the expected gain

$$T_x(f_n^*) = E\bigl(\max\{0, F(x;\omega) - f_n^*\} \mid z_n\bigr) \tag{11}$$

$$\phantom{T_x(f_n^*)} = E\bigl(E(\max\{0, F(x;\omega) - f_n^*\} \mid \sigma^2, z_n) \mid z_n\bigr) \tag{12}$$

where the outermost expectation is with respect to the conditional distribution
of $\sigma^2$ given the sample. It is possible to show that, at least when $\alpha_n$
is even, $T_x(f_n^*)$ can be given in explicit form. Obviously, as this case
generalizes the usual one with $\sigma^2$ known a priori, explicit maximization of the
expected gain $T_x$ is impossible and one has to resort to numerical optimization;
however, it has been shown in (Locatelli, 1992) that several criteria can be given for
excluding subsets of $[0, 1]$ from the search for an extremum of $T_x$ and, in many
cases, conditions have been given under which such exclusion is permanent; in other
words, it is possible to show that if the expected gain in a certain sub-interval
of $[0, 1]$ falls below a threshold at step $n$, the Bayesian algorithm will not place
any new observation in that interval before stopping.
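
For a fixed value of $\sigma^2$, the inner expectation in (12) is the expected positive part of a normal random variable, which has a well-known closed form. The Python sketch below evaluates it; the function name is hypothetical, the conditional mean and standard deviation are assumed to be supplied by the caller, and the averaging over the posterior of $\sigma^2$ (the outer expectation) is not shown.

```python
import math


def expected_gain_known_sigma(mean, std, f_best):
    """E[max(0, Y - f_best)] for Y ~ N(mean, std^2).

    This is only the inner expectation of (12), i.e. sigma^2 is treated as known."""
    if std <= 0.0:
        # Degenerate case: Y equals its mean with probability one.
        return max(0.0, mean - f_best)
    u = (mean - f_best) / std
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)   # standard normal density
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))          # standard normal distribution
    return (mean - f_best) * cdf + std * pdf
```

The gain grows with the conditional standard deviation, which is why points far from existing observations remain attractive even when their conditional mean lies below the current record.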

As far as stopping is concerned, the one-step look-ahead stopping rule, once the
new observation point has been chosen, is trivially obtained as the rule which
calls for stopping as soon as the computed expected gain falls below $c$. Thus the
proposed modification of the Bayesian approach makes it possible, at no cost, to
regulate the sample size dynamically on the basis of the observations.
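
The interplay between placement and stopping can be sketched as a short loop. This is a minimal Python illustration, not the authors' algorithm: the candidate grid, the function names, and in particular `toy_gain` are assumptions made for the example. `toy_gain` is a crude stand-in that grows with the distance to the nearest observation; it is not the expected gain $T_x$ of the paper, which would be plugged in through the `expected_gain` argument.

```python
import math


def one_step_lookahead(f, expected_gain, candidates, c, x_start, max_evals=10_000):
    """Rolling-horizon loop: observe where the expected gain is largest,
    stop as soon as that largest expected gain does not exceed the cost c."""
    xs, fs = [x_start], [f(x_start)]
    while len(xs) < max_evals:
        # Placement step: candidate with the largest expected gain.
        x_next = max(candidates, key=lambda x: expected_gain(x, xs, fs))
        # Stopping step: one more observation is not worth its cost.
        if expected_gain(x_next, xs, fs) <= c:
            break
        xs.append(x_next)
        fs.append(f(x_next))
    return xs, fs


def toy_gain(x, xs, fs):
    """Hypothetical stand-in for the expected gain (NOT the paper's T_x):
    it depends only on the distance to the nearest observed point."""
    return 0.5 * math.sqrt(min(abs(x - xi) for xi in xs))
```

With a rule of this shape a smaller cost $c$ can only lengthen the run, since the sequence of chosen points is the same and only the stopping time changes.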

The following result has also been proven:

Theorem. If $f$ is a continuous function on $[0, 1]$ satisfying

$$|f(x) - f(y)| \le L \sqrt{|x - y|} \quad \forall x, y \in [0, 1] \tag{13}$$

where $L > 0$, then the one-step look-ahead Bayesian algorithm will almost surely
stop after a finite number of steps. The stopping time is $O(L^2/c^2)$.

As  a  final  remark,  notice  that  only  existence  of  L  is  required  for  finite  stop¬ 
ping;  knowledge  of  the  constant  L  is  not  necessary. 

4  Consistency 

Although it has been proven that finite stopping always occurs, the question
naturally arises of the accuracy of the proposed algorithm. It should be clear that
in looking for finite stopping rules one has to trade accuracy against computational
effort. Nonetheless it seems worthwhile to analyze the asymptotic behaviour of
the algorithm (as $c \to 0$) in order to judge its consistency, i.e. the convergence
of the estimated optimum to the true one.

Unfortunately it is possible to exhibit counter-examples showing that, for
particular choices of $f$, the Bayesian algorithm will not produce a dense set
of observation points, thus missing some subset of $[0, 1]$ with positive measure.
This situation is typical for any Bayesian optimization algorithm relying upon
an estimate of the variance parameter $\sigma^2$. In practice it is possible to exhibit
functions which are constant in $[\epsilon, 1]$, with $\epsilon > 0$ chosen in such a way
that no observation point will ever be placed in $(0, \epsilon)$; the estimate of
$\sigma^2$ (any consistent estimate, not just the one proposed in this paper) will
rapidly converge to 0 and the Bayesian optimizer will become more and more convinced
that his/her function is everywhere constant.

It is however possible, albeit with some technical difficulty, to restore
consistency by changing the prior distribution on $\sigma^2$ in a way which prevents
the estimate from going to 0. In practice one chooses a small threshold $\epsilon > 0$
and redefines the prior in such a way as to give positive mass only to
$\sigma^2 \in [\epsilon, +\infty)$. It is then again possible to find update formulae
similar to those in (10) and to prove results on finite stopping as well as
consistency. The details of this modified algorithm, as well as the proofs, will
appear elsewhere.


Conclusions 

It has been shown that it is possible, with only a small increase in computational
effort with respect to the traditional Bayesian algorithm, to achieve the following
goals:

•  eliminate the necessity of pre-specifying the value of the parameter $\sigma^2$;

•  introduce a finite stopping criterion, so that the sample size does not have to
be specified in advance;

•  guarantee consistency.


Abstract

We consider first a variant of the Analytic Hierarchy Process (AHP) with a one-parametric
class of geometric scales to quantify human comparative judgement, and with a
multiplicative structure: logarithmic regression to calculate the impact scores of the
alternatives at the first evaluation level, and a geometric-mean aggregation rule to
calculate the final scores at the second level. We demonstrate that the rank order of the
impact scores and the final scores is scale-independent. Moreover, we show that the
multiplicative AHP is an exponential version of the Simple Multi-Attribute Rating
Technique (SMART). In fact, the multiplicative AHP is concerned with ratios of intervals
on the dimension of desirability, whereas SMART analyzes differences of the corresponding
orders of magnitude.


1  Introduction 

We are concerned with two well-known methods for multi-criteria decision analysis: the
Analytic Hierarchy Process (AHP) and the Simple Multi-Attribute Rating Technique
(SMART). They have primarily been designed to evaluate a finite number of decision
alternatives $A_1, \ldots, A_n$ under a finite number of conflicting performance criteria
$C_1, \ldots, C_m$, by a single decision maker or by a decision-making body. The AHP
(Saaty (1980), see also Zahedi (1986) and French (1988)) is based upon pairwise
comparisons of the alternatives and the criteria, so that the decision maker’s judgement
is rather fragmented. SMART (see von Winterfeldt and Edwards (1986)), a popular
offspring of Multi-Attribute Utility Theory (MAUT), proposes a direct rating procedure
enabling the decision maker to keep a more holistic view of the decision alternatives.

Throughout  the  paper  we  illustrate  the  relationship  between  the  multiplicative  AHP  and 
SMART.  We  demonstrate  that  the  multiplicative  AHP  is  concerned  with  ratios  of  intervals 
on  the  axis  of  desirability,  whereas  SMART  uses  the  differences  of  the  corresponding  orders 
of  magnitude.  The  idea  was  suggested  by  the  familiar  mode  of  operation  in  psycho-physics, 
where  ratios  of  light  and  sound  intensities  are  expressed,  not  necessarily  in  their  original 
magnitudes,  but  in  orders  of  magnitude  on  the  deci-Bell  scale. 


2  Criticism  on  the  original  AHP 

Saaty’s original version of the AHP has been criticized for various reasons: (a) for the
fundamental scale used to quantify human judgement, (b) because it estimates the impact
scores of the alternatives by the Perron-Frobenius eigenvector, and (c) because it
calculates the final scores of the alternatives via the arithmetic-mean aggregation rule.
The controversial issues, to be treated in the presentation, are not new. Only a few years
ago, Zahedi (1986) pointed out that the criticism of the AHP concentrated on the
estimation of the impact scores, but that no major controversy existed concerning the
aggregation step. Criticism of Saaty’s fundamental scale was not mentioned in Zahedi’s
survey paper, but Belton (1986) brought forward several arguments against the scale and
the aggregation rule.

More recently, Barzilai et al. (1987, 1991) observed that the AHP, since it is essentially
based upon ratio information, would benefit from a conversion into a variant with a
multiplicative structure. With the geometric row means of the reciprocal
pairwise-comparison matrices to calculate the impact scores, and with a geometric-mean
aggregation rule to calculate the final scores of the alternatives, one could aggregate
in two different ways without affecting the final scores: either by first combining the
pairwise-comparison matrices into one matrix from which one obtains the final scores, or
by combining the impact scores under the respective criteria into a vector of final
scores. By these multiplicative operations one avoids rank reversal when copies of
alternatives are added to or deleted from a consistently assessed set of alternatives
(a deficiency of the original AHP).
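
That the two aggregation routes coincide can be checked numerically. The Python sketch below is an illustration only: the function names and the weighted element-wise geometric mean used to combine the matrices are assumptions made for the example, not code or notation taken from Barzilai et al.

```python
import math


def geometric_row_means(R):
    """Geometric mean of each row of a pairwise-comparison matrix."""
    n = len(R)
    return [math.prod(row) ** (1.0 / n) for row in R]


def normalize(v):
    """Rescale positive scores so that they sum to one."""
    total = sum(v)
    return [x / total for x in v]


def scores_then_aggregate(matrices, weights):
    """Route 1: impact scores per criterion first, then weighted geometric mean."""
    n = len(matrices[0])
    impact = [geometric_row_means(R) for R in matrices]
    final = [math.prod(impact[i][j] ** weights[i] for i in range(len(matrices)))
             for j in range(n)]
    return normalize(final)


def aggregate_then_scores(matrices, weights):
    """Route 2: combine the matrices element-wise first, then take row means."""
    n = len(matrices[0])
    combined = [[math.prod(R[j][k] ** w for R, w in zip(matrices, weights))
                 for k in range(n)] for j in range(n)]
    return normalize(geometric_row_means(combined))
```

Both routes reduce to the same product of powers of the matrix entries, which is why the order of the two operations does not matter in a multiplicative structure.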

Barzilai et al. (1987, 1991) restricted their analysis to the case where one has exactly
one estimate for each pair of alternatives, under every criterion. Similarly, they did not
address the question of how to scale the decision maker’s verbal judgement. These issues
have been our concern since the early eighties. We proposed logarithmic regression in
order to handle missing as well as multiple estimates (the author (1982, 1987, 1990); note
that the regression problem reduces to the calculation of geometric row means when there
is exactly one estimate for each pair of alternatives), and we introduced a one-parametric
class of geometric scales in order to quantify the judgemental statements expressing the
opinions of the decision makers. We could demonstrate that the rank order of the impact
scores is scale-independent. For a detailed analysis we refer the reader to the author’s
report (1991). In the presentation, we explain the choice of values for the scale
parameter via psycho-physical arguments, and we show, via a new definition of the
criterion weights, that the rank order of the final scores is also scale-independent in a
multiplicative structure.
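
How logarithmic regression copes with missing and multiple estimates can be illustrated with a generic log-least-squares sketch in Python. The author's own formulation is given in the cited reports and is not reproduced here; the function name, the data layout and the coordinate-wise iteration are assumptions made for this example, and the pairs that are compared are assumed to connect all alternatives.

```python
import math


def log_regression_scores(n, estimates, sweeps=500):
    """Least-squares fit of ln r = w_j - w_k over all available estimates.

    estimates: list of (j, k, r), meaning 'alternative j is judged r times
    as good as alternative k'. Pairs may be missing or may occur repeatedly."""
    w = [0.0] * n
    for _ in range(sweeps):
        for j in range(n):
            # Each estimate involving j suggests a value for w_j; the
            # coordinate-wise least-squares minimizer is their average.
            targets = []
            for a, b, r in estimates:
                if a == j:
                    targets.append(w[b] + math.log(r))
                elif b == j:
                    targets.append(w[a] - math.log(r))
            if targets:
                w[j] = sum(targets) / len(targets)
    v = [math.exp(x) for x in w]
    total = sum(v)
    return [x / total for x in v]       # impact scores normalized to sum to one
```

With exactly one estimate per pair the result agrees with the normalized geometric row means, and two estimates for the same pair act like their geometric mean.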

3  The  multiplicative  AHP 

In summary, we propose a multiplicative version of the AHP which operates as follows.
In the basic experiment at the first evaluation level, where two alternatives $A_j$ and
$A_k$ are compared under the criterion $C_i$, we collect the preference information
(indifference, weak, strict, strong, or very strong preference for one of the two), and
we convert the verbal statement of the decision maker (the selected gradation of his
comparative judgement) into a numerical value on a geometric scale, that is, on a
discrete scale with echelons constituting a series with geometric progression. Next, we
use logarithmic regression to calculate the single-criterion impact scores $V_i(A_j)$,
$j = 1, \ldots, n$, approximating the subjective values of the alternatives under
criterion $C_i$. The impact scores are not unique. They have a multiplicative degree of
freedom, and they can accordingly be normalized in such a way that they sum up to unity.

The basic experiment at the second evaluation level, where two criteria are mutually
compared, is somewhat more complicated. We suggest that the decision maker consider two
real or imaginary alternatives with the property that his preference for one of them under
the first criterion equals his preference for the other alternative under the second
criterion. Next, we ask him to state whether he is indifferent between the two
alternatives under the two criteria simultaneously, or whether one of the two criteria
gives a decisive (weak, strict, strong, or very strong) preference for one of the two
alternatives. Thereafter, the judgemental statements are converted into numerical values
on a particular geometric scale. Logarithmic regression yields normalized weights
$w(C_i)$, $i = 1, \ldots, m$, for the respective criteria. Finally, there is an
aggregation step generating the final, multi-criteria scores $f(A_j)$ via the
geometric-mean aggregation rule

$$f(A_j) = \alpha \prod_{i=1}^{m} V_i(A_j)^{c_i}$$

where $c_i$ simply denotes the weight $w(C_i)$, and $\alpha$ the normalization factor to
guarantee that the final scores sum up to unity. By these quantities, the alternatives are
unambiguously ranked in a subjective order of preference when we operate with geometric
scales. Moreover, the ratio of any two final scores does not depend on the physical or
monetary units whereby the performance of the alternatives under the respective criteria
is originally measured.

In the multiplicative AHP, the gradations of comparative judgement are put on a scale
with geometric progression. Let us briefly explain the underlying reasons. We assume
that the subjective weighing of the alternatives under a particular criterion is carried
out in a given context represented by an interval on the corresponding axis or dimension.
This interval is partitioned into subintervals which are felt to be of the same order of
magnitude. Hence, the echelons demarcating the subintervals constitute a sequence with
geometric progression; the property is well-known in psycho-physics, see Stevens (1957),
Marks (1974), Michon, Eykman and de Klerk (1978), Roberts (1979), and Zwicker (1982).
Finally, we take ratios of echelons to represent ratios of subjective values and we let
them correspond with the gradations of comparative judgement. This enables us to assign
numerical values to the gradations. Thus, we set

$$r_{jk} = \exp(\gamma\,\delta_{jk})$$

where $\delta_{jk}$ is an integer-valued index designating the gradation of the decision
maker’s judgement as follows:

-8  very strong preference for $A_k$ versus $A_j$,
-6  strong preference for $A_k$ versus $A_j$,
-4  strict (definite) preference for $A_k$ versus $A_j$,
-2  weak (mild, moderate) preference for $A_k$ versus $A_j$,
 0  indifference between $A_j$ and $A_k$,
+2  weak (mild, moderate) preference for $A_j$ versus $A_k$,
+4  strict (definite) preference for $A_j$ versus $A_k$,
+6  strong preference for $A_j$ versus $A_k$,
+8  very strong preference for $A_j$ versus $A_k$.

Intermediate integer values of $\delta_{jk}$ designate hesitations between two adjacent
gradations. The positive parameter $\gamma$ is the scale parameter which characterizes
the scale, and $\exp(\gamma)$ is the progression factor.
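
The conversion from a gradation index to a numerical ratio can be sketched in a few lines of Python. The names below are hypothetical; the default value of the scale parameter is $\ln 2$, the value which the text assigns to the natural scale in Section 5.

```python
import math

GAMMA = math.log(2.0)   # natural scale: progression factor exp(gamma) = 2

# Major gradations, indexed by the absolute value of the gradation index.
GRADATIONS = {
    0: "indifference",
    2: "weak preference",
    4: "strict preference",
    6: "strong preference",
    8: "very strong preference",
}


def ratio_estimate(delta, gamma=GAMMA):
    """Numerical ratio exp(gamma * delta) assigned to the gradation index delta."""
    return math.exp(gamma * delta)
```

On this scale a weak preference for $A_j$ corresponds to a ratio of about 4 and a very strong preference to a ratio of about 256, while the opposite judgements give the reciprocal values.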

4  Relationship  with  SMART 

In psycho-physical measurement, the ratios of audible sound and visible light intensities
are usually recorded as differences on the deci-Bell scale. This means that not the ratio
magnitudes themselves are considered, but their orders of magnitude. This observation
suggested to us the assumption that a difference of grades in SMART represents the order
of magnitude of a ratio of subjective values in the multiplicative AHP. In doing so, we
obtain a simple, straightforward relationship between the two multi-criteria methods,
enabling us to carry out a cross-validation of the results. Both methods are now
incorporated in the REMBRANDT system of L. Rog (Delft University of Technology) for Ratio
Estimation in Magnitudes or deci-Bells to Rate Alternatives which are Non-DominaTed.
Considering two alternatives $A_j$ and $A_k$ under a given criterion, with the respective
grades $g_j$ and $g_k$ assigned to them, we take the quantity

$$r_{jk} = \exp(\gamma (g_j - g_k))$$

to represent the gradation of comparative judgement (normally, we use the grades 4, 6, 8,
10 to designate poor, fair, good, and excellent performance; the symbol $\gamma$ stands
for the scale parameter). Hence, the grades assigned to the alternatives under the
respective criteria can immediately be employed in the multiplicative AHP. The user can
even work with the multiplicative AHP under some of the criteria, and with SMART under
the remaining ones.
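
The relationship can be checked on a small example. The Python sketch below turns SMART grades into a ratio matrix and computes impact scores by geometric row means, which is what logarithmic regression reduces to when every pair has exactly one estimate; the function names are hypothetical and the code is an illustration, not part of the REMBRANDT system.

```python
import math


def ratio_matrix_from_grades(grades, gamma):
    """Pairwise ratios r_jk = exp(gamma * (g_j - g_k)) derived from SMART grades."""
    return [[math.exp(gamma * (gj - gk)) for gk in grades] for gj in grades]


def normalized_geometric_row_means(R):
    """Impact scores: geometric row means, rescaled to sum to one."""
    n = len(R)
    means = [math.prod(row) ** (1.0 / n) for row in R]
    total = sum(means)
    return [m / total for m in means]


def smart_based_scores(grades, gamma):
    """Impact scores of the multiplicative AHP fed with SMART grades."""
    return normalized_geometric_row_means(ratio_matrix_from_grades(grades, gamma))
```

In this sketch the scores are proportional to $\exp(\gamma g_j)$, so the logarithm of a score ratio equals $\gamma$ times the grade difference, and changing $\gamma$ alters the scores but not their rank order.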

In the psycho-physical literature, the issue of how human beings judge the relationship
between two stimuli was brought up a few decades ago. Torgerson (1961) observed that
human beings estimate differences of subjective values when they are requested to express
their judgement on a category scale with arithmetic progression, and they estimate ratios
of subjective values when the proposed scale is geometric. Thus, they estimate the
relationship as it is required in the experiment. Which of the two interpretations is
correct cannot be decided empirically because they are alternative ways of saying the
same thing.

Torgerson's observation is easy to understand if we assume that the subjective stimulus
values are not identically used in the two types of experiments. In the ratio experiment
with a geometric scale, human beings judge the ratio of two stimulus values. In the
difference experiment with an arithmetic scale, they do not judge the ratio itself but its
order of magnitude, which is essentially a logarithm of the ratio. Thus, ratio judgement
is exponentially related to difference judgement (this was confirmed by psycho-physical
research in the seventies and eighties, see Veit (1978) and Birnbaum (1982)). Moreover,
the multiplicative AHP and SMART do the same thing albeit in alternative ways, and
they are exponentially related.

5  Choice of scale-parameter values

We sketch human behaviour in various areas in order to explain the numerical values
assigned to the scale parameter and hence to verbal statements such as weak, strict,
strong, or very strong preference for $A_j$ with respect to $A_k$. To our knowledge, this
approach to explaining the numerical values is new. First, we provide a heuristic
introduction to illustrate the transition from car prices to the subjective judgements
whereby cars are referred to as “cheap”, “somewhat more expensive”, “more expensive”, or
“much more expensive”. In fact, we subdivide a given price range into a number of price
categories (intervals) which are felt to be of the same order of magnitude, and we use
the corresponding grid points (levels) to establish ratios of price increments (echelons)
expressing what we mean by “somewhat more”, “more”, and “much more”. Next, we show that
human judgement leads in many unrelated areas (progression of historical periods and
planning horizons, classification of nations according to size, perception of light and
sound intensities) to the same categorization of intervals: there are roughly four major
categories, the echelons constitute a sequence with geometric progression, and the
progression factor is roughly 4.

We use these results in the REMBRANDT system to introduce a natural geometric scale
for the quantification of verbal, comparative judgement: a scale with major as well as
threshold echelons, and the progression factor 2. The scale parameter $\gamma$ is
accordingly set to the value $\ln 2$. Sensitivity analysis with a short and a long
geometric scale in the neighbourhood of the natural scale usually shows that the impact
scores are rather stable. This is illustrated by the numerical example at the end of the
presentation.

Lastly, we show that the relative importance of the criteria can also be established via
the pairwise comparison of two alternatives. By this new approach, we found only one
geometric scale to quantify the relative importance: a scale with major and threshold
echelons, and with progression factor $\sqrt{2}$, so that the scale parameter $\gamma$ is
set to $\ln \sqrt{2}$. Pairwise comparisons at the first and the second evaluation level
will accordingly be rather similar, despite the conceptual differences. Moreover, by the
uniqueness of the scale at the second level we can show that the rank order of the final
scores does not depend on the geometric scales employed at the first level.

6  Epilogue

The  A  HP  was  intended  to  structure  a  decision  process  by  the  introduction  of  a  hierarchy 
of  evaluation  levels,  much  higher  than  the  two-level  model  considered  in  the  present  paper. 
Given  the  difficulties  encountered  in  the  aggregation  step,  a  hierarchical  structure  with 
more  than  two  levels  should  be  thoroughly  studied  before  it  is  launched  in  a  practical 
environment.  So  far,  we  only  formalized  the  concept  of  the  relative  importance  of  the 
criteria,  via  a  model  which  is  based  on  the  pairwise  comparison  of  alternatives.  In  a 
hierarchy  of  evaluation  levels,  we  would  run  up  against  the  relative  importance  of  sub¬ 
criteria,  sub-sub-criteria  etc.,  concepts  which  are  still  undefined.  The  original  version  of 
the  AHP  disregards  these  questions,  and  constructs  multi-level  hierarchies  as  audaciously 
as  it  carries  out  the  subsequent  analysis.  Such  a  top-down  approach  (see  als  Keeney  and 
RaifFa  (1976))  is  in  sharp  contrast  with  the  cautious  bottom-up  approach  followed  by 
the  French  school  in  multi-criteria  analysis  (Roy  (1985),  Scharlig  (1985),  Vincke  (1989)). 
Cross-fertilization  between  the  French  and  the  American  school  in  multi-criteria  analysis 
has  been  meagre,  however.  Well-known  Anglo-Saxon  textbooks  (von  Winterfeldt  and 
Edwards  (1986),  French  (1988))  ignore  the  French  school.  None  of  the  five  textbooks 
just  mentioned  gives  a  thorough  description  of  the  AHP,  by  its  world-wide  popularity 
a  prominent  method  in  the  American  school.  The  relationship  between  the  AHP  and 
SMART,  established  by  the  author  (1991),  may  enhance  the  power  of  the  two  methods, 
provided  that  they  are  jointly  embedded  in  a  flexible  decision-support  system. 
INTRODUCTION

Vehicle  routing  problems  with  time  windows  have  interested 
researchers  and  practitioners  for  some  years.  Time  window 
constraints occur in, for example, newspaper delivery, delivery of
fresh and frozen food, dial-a-ride services, and school bus
systems. In those problems one or several time windows are
connected to each customer, imposing an earliest and a latest allowable
start for delivery at the customer. Usually time windows are also
connected to the depot.

The time windows may be hard, meaning that a solution is considered
to be infeasible if the time window constraints are not
met. If the time windows are soft, it is allowed to deliver to the
customer outside the time window; however, a penalty is then
imposed. In this paper problems with one hard time window per
customer will be considered.

Surveys of the literature on the VRPTW are given in Solomon and
Desrosiers (9) and in Desrochers, Lenstra, Savelsbergh and Soumis
(3). The exact approaches that the author is aware of can be
divided into the following four classes:

1. Approaches based on dynamic programming. This
line of research has been followed by Kolen,
Rinnooy Kan and Trienekens (8) and can be regarded
as an extension of the Christofides,
Mingozzi and Toth (1) state space relaxation
method. Problems with up to 14 customers have
been solved to optimality.

2. Approaches based on column generation and set
partitioning. In this class Desrochers, Desrosiers
and Solomon (4) recently presented an
exact method with the capability of solving
100-customer problems. The algorithm is based
on a combination of LP relaxed set covering
and column generation.

3. Lagrangean relaxation based methods. Madsen
et al. (6,7) have applied various Lagrangean
relaxation schemes to the VRPTW in order to
produce lower bounds. They are currently capable
of solving 105-customer problems to optimality
using a combination of Lagrangean relaxation
and branch and bound.

4. An extension of Fisher's (5) exact algorithm
for the classical vehicle routing problem to
the case with time window constraints. The method
is based on a K-Tree relaxation. Problems
with up to 100 customers have been solved to
optimality.


In this paper an exact solution method for the vehicle routing
problem with time windows (VRPTW) will be presented. The method
is from class 3 and is based on a special version of Lagrangean
relaxation in which variables are split into multiple copies.
The purpose is to form a problem that can be separated into a
number of subproblems with known, usable structures.

The VRPTW is split into two types of subproblems: a semiassignment
problem and a series of shortest path problems with time
windows, capacity constraints, and the possibility of containing
negative cycles. The multiplier updating in the coordinating
problem is done by subgradient optimization. Branch and bound is
used to close the duality gap.

PROBLEM  FORMULATION 

The problem can be defined by the following parameters:

$m$ = number of vehicles
$n$ = number of customers; index 0 denotes the depot
$Q_k$ = capacity of vehicle $k$
$q_i$ = demand of customer $i$
$c_{ij}$ = cost of travel from customer $i$ to $j$
$t_{ij}$ = time for travel from customer $i$ to $j$
$s_i$ = service time at customer $i$
$e_i$ = earliest time allowed for starting delivery at customer $i$
$u_i$ = latest time allowed for starting delivery at customer $i$
$T$ = a scalar larger than the travel time of any feasible route

We  are  required  to  assign  each  customer  to  a  vehicle  and  to 
sequence  the  set  of  customers  assigned  to  each  vehicle  so  as  to 
minimize  cost  subject  to  vehicle  capacity  constraints  and  the 
requirement  that  the  time  to  begin  delivery  at  customers  lies  in 
the  time  interval  prescribed.  The  decision  variables  are: 

$x_{ijk}$ = 1 if vehicle $k$ travels directly from customer $i$ to
customer $j$; 0 otherwise
$y_{ik}$ = 1 if customer $i$ is visited by vehicle $k$; 0 otherwise
$t_i$ = the time to begin delivery at customer $i$
$t_{0k}$ = departure time of vehicle $k$ from the depot
$t_{n+1,k}$ = arrival time of vehicle $k$ at the depot

In  words  the  problem  can  now  be  formulated  in  the  following  way: 

1.  Minimize  the  total  travel  costs 

2.  If  route  k  visits  a  point  it  has  to  leave  the  point  again 

3.  Each  route  originates  and  terminates  at  the  depot 

4. The time to begin delivery at the customer shall be
within the limits of the time window. The same applies
to the depot.

5. If a vehicle travels directly from $i$ to $j$, then $t_i$
should be compatible with $t_j$.

6.  Each  route's  demand  is  within  the  capacity  limit  of 
the  vehicle  serving  the  route. 

7.  Each  customer  is  visited  exactly  once. 
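
The time window and capacity requirements listed above can be checked for a single route with a short sketch. The function name `route_is_feasible`, the list-based data layout, and the convention that a vehicle arriving early simply waits until the window opens are assumptions made for illustration; they are not taken from the paper.

```python
def route_is_feasible(route, e, u, s, t, q, capacity):
    """Check hard time windows and capacity for one route.

    route    : customer indices in visiting order (depot 0 not included)
    e, u     : earliest / latest allowed start of delivery, per node
    s        : service time per node (s[0] is the depot, usually 0)
    t        : travel-time matrix, t[i][j]
    q        : demand per node
    capacity : capacity of the vehicle serving the route
    """
    # Requirement 6: the route's demand must fit into the vehicle.
    if sum(q[i] for i in route) > capacity:
        return False

    time, prev = e[0], 0          # leave the depot when its window opens
    for i in route:
        arrival = time + s[prev] + t[prev][i]
        time = max(arrival, e[i])  # assumption: waiting is allowed if early
        if time > u[i]:            # hard window violated
            return False
        prev = i

    # Requirement 4 also applies to the depot on return.
    return time + s[prev] + t[prev][0] <= u[0]
```

With hard windows a route is either accepted or rejected; soft windows would replace the `return False` on a late arrival by a penalty term.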

If the connection between $x_{ijk}$ and $y_{ik}$ is relaxed, the problem
decouples into, for example, a semiassignment problem (containing the
$y_{ik}$'s) and $m$ shortest path problems with time windows and capacity
constraints (SPTWCC) (containing $x_{ijk}$, $t_i$, and the depot departure
and arrival times of vehicle $k$). The
semiassignment problem is easy to solve by inspection. The SPTWCC
is a difficult and time-consuming problem to solve. It is solved by
generalizing Desrochers and Soumis' Generalized Permanent
Labeling Algorithm (2), extending the three labels by a fourth
one. The solution method is based on dynamic programming. The
solution of the SPTWCC may contain negative cycles limited by the
time windows. The coordinating problem arising from the relaxation
is solved iteratively by subgradient optimization. Due to
the integrality of $x$ and $y$ there may be a duality gap, so that only a
lower bound for the VRPTW is obtained. If a gap occurs,
branch and bound is used by fixing a $y$ to one or zero. Then the
subproblems are solved again.
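
One multiplier update of the coordinating problem might look like the following sketch. The paper does not state its step-size rule or sign convention, so everything specific here is an assumption: the Lagrangean term is taken to be the multiplier times (y minus the outgoing x), the step follows the common rule theta * (UB - LB) / ||g||^2, and the names `subgradient_step`, `x_out` and `theta` are hypothetical.

```python
def subgradient_step(lam, x_out, y, upper_bound, lower_bound, theta=1.0):
    """One subgradient update for the relaxed coupling y[i][k] = x_out[i][k].

    lam    : current multipliers, lam[i][k]
    x_out  : x_out[i][k] = sum over j of x[i][j][k], taken from the
             shortest path subproblem solutions
    y      : y[i][k] from the semiassignment subproblem
    Returns the new multipliers and the step length used.
    """
    rows, cols = len(lam), len(lam[0])
    # Subgradient: violation of the relaxed coupling constraint.
    g = [[y[i][k] - x_out[i][k] for k in range(cols)] for i in range(rows)]
    norm2 = sum(v * v for row in g for v in row)
    if norm2 == 0:
        # Both subproblems agree: the relaxed solution is feasible.
        return lam, 0.0
    step = theta * (upper_bound - lower_bound) / norm2
    new_lam = [[lam[i][k] + step * g[i][k] for k in range(cols)]
               for i in range(rows)]
    return new_lam, step
```

A zero subgradient means the semiassignment and shortest path solutions are consistent, so no gap remains at that node and no branching on a y variable is needed there.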
COMPUTATIONAL  RESULTS

The  100-point  test  problems  developed  by  Solomon  (8)  were  used  as 
benchmark  problems.  Furthermore  a  31-point  problem  (a  reduced 
version  of  a  Solomon  problem)  and  a  105-point  problem  (a 
combination  of  two  Solomon  problems)  were  used.  Also  subsets  of 
the  Solomon  problems  consisting  of  25  and  50  points  were  used. 
We solved one 105-point problem, three 100-point problems, ten 50-point
problems, one 31-point problem, and 22 25-point problems to
optimality. The clustered problems were the easiest to solve,
while random problems and mixed random and clustered problems
were more difficult. If the time windows were too wide, the SPTWCC
algorithm failed. This algorithm has a complexity depending on
the square of the vehicle capacity and the square of the sum of
the time window widths. The column generation algorithm mentioned
under class 2 seems at present to be more effective. However,
some of the test problems can only be solved by
the column generation algorithm, some can only be
solved by the Lagrangean algorithm, and some cannot be
solved at all. Therefore more research is needed within this
subject area.
1. The  Quadratic  assignment  problem

The QAP arises in facilities location and layout problems when, for example, n facilities
are to be assigned to n sites and when the interactions between the facilities depend upon
their location. It can be formulated as follows, if $F = (f_{ij})$ is the matrix of flows between
facilities $i$ and $j$, and $D = (d_{kl})$ the matrix of distances between locations $k$ and $l$:

find a permutation $p$ of the set $N = \{1, 2, \ldots, n\}$

which minimizes the global cost function

$$\min \mathrm{Cost}(p) = \sum_{i} \sum_{j} f_{ij}\, d_{p(i)p(j)}$$

The QAP, known to be NP-complete, has shown itself to be a very
difficult problem computationally. We will show that only problems of moderate size
($n \le 20$) can be solved exactly.
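
The cost function can be evaluated directly, as in the sketch below. Indices are 0-based here (the text uses 1..n), and the names `qap_cost` and `qap_brute_force` are illustrative. The exhaustive search is included only to show why exact solution is limited to small n: it enumerates all n! permutations.

```python
from itertools import permutations


def qap_cost(F, D, p):
    """Cost(p) = sum over i, j of F[i][j] * D[p[i]][p[j]].

    F : flow matrix between facilities
    D : distance matrix between sites
    p : p[i] is the site assigned to facility i (0-based)
    """
    n = len(p)
    return sum(F[i][j] * D[p[i]][p[j]] for i in range(n) for j in range(n))


def qap_brute_force(F, D):
    """Return (best cost, best permutation) by full enumeration.

    Only usable for very small n, since n! permutations are examined.
    """
    n = len(F)
    return min((qap_cost(F, D, p), p) for p in permutations(range(n)))
```

For three facilities on a line with flows `F = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]` and distances `D = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]`, the best placement puts the pair with the smallest flow at the two ends.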

2. New  exact  algorithms 

There exist two approaches to solving the QAP exactly. The first one, which consists of
reformulating the problem as a linear mixed integer program and solving it by cutting plane
methods, has not been very successful in the past. It manages to solve exactly only
problems of size up to eight (Kaufman and Broeckx [18], Balas and Mazzola [2], Bazaraa
and Sherali [3]). The second one is based upon the concept of Branch and Bound (B&B)
enumeration.

This approach yields better results, so we will present the most recent B&B algorithms.
2.1. Lower  bounds

To solve QAPs exactly, the computation of the lower bound represents one of the main
difficulties. Either the bound is too loose and the number of B&B nodes becomes too
high, or the computational time to bound one node is prohibitive.

We will show that recent lower bounds based upon:

- the eigenvalue approach (introduced by Finke, developed by Rendl [30]),
- equivalent reformulations of the problem (Carraresi and Malucelli [11,12]),
- the reduction approach (introduced by Burkard [9] and Roucairol [31,32],
developed recently by Pardalos [14]),

do, of course, improve on the results obtained with the oldest and most commonly used bound, the
Gilmore-Lawler bound (GLB) [17,20], which is based on the notion of ranked product.
But this improvement remains small, and the average relative error (in comparison with
the best solution) computed at a node of the B&B tree is quite large and decreases
slowly when considered at a node of higher level.

α = Ameliorative Rate = (New bound - GLB) / (Best Value - GLB)


Table 1. Ameliorative rates of other bounds in comparison with GLB

Until now, it has been very difficult to measure the effectiveness of these lower bound improvements in a
B&B algorithm; for most of them the results have been announced as appearing in a forthcoming paper.
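
Two quantities from this subsection are easy to state in code. The first is the ranked product, read here as the smallest scalar product of two vectors over all pairings of their components, which is obtained by sorting one vector increasingly and the other decreasingly. The second is the ameliorative rate defined above. The function names are illustrative, and the full Gilmore-Lawler bound (which combines such products through a linear assignment problem) is not reproduced.

```python
def ranked_product(a, b):
    """Minimal scalar product of a and b over all pairings of components.

    By the rearrangement inequality the minimum is reached when the
    smallest entries of a meet the largest entries of b.
    """
    return sum(x * y for x, y in zip(sorted(a), sorted(b, reverse=True)))


def ameliorative_rate(new_bound, glb, best_value):
    """Share of the gap between GLB and the best value closed by a new bound."""
    return (new_bound - glb) / (best_value - glb)
```

A rate of 0 means the new bound is no better than GLB, and a rate of 1 means it reaches the best known value.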

2.2. Branching scheme and reduction tests


As it is very difficult to obtain good lower bounds efficiently, another approach consists in still using
the GLB bound but concentrating the effort on another way to reduce enumeration. Thus, we have
proposed [23] a new exact algorithm which has been able to solve exactly, for the first time,
problems of size up to twenty in quite a reasonable time. The idea is to define a new branching
scheme together with appropriate branching rules and reduction tests. We will first describe one of
the reduction tests: the symmetry test.

We use symmetry properties of the real-world application, like symmetries in the implementation
sites. On Nugent's problems [27], for instance, the sites are on a grid and the distances are
rectangular ones. Therefore, symmetrically equivalent solutions come in groups of four and have the
same cost.

Figure 1.

Our B&B method will not create, study and bound equivalent nodes with symmetrical solutions in
different branches of the tree. In order to test the property of symmetry
between two sites quickly and efficiently, we give the following definition:

Two sites i and j are symmetrically equivalent if

1. i and j belong to the same equivalence class: their vectors of distances to the other available sites,
with the components ranked increasingly, are equal;

2. their vectors of repartition, in the different equivalence classes, of all available sites at a
distance d are equal.
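
The first condition of the definition can be sketched as follows. Only the sorted-distance comparison is shown; the second condition (the repartition of sites at each distance over the equivalence classes) is not implemented. The names `distance_profile` and `same_distance_class` are hypothetical.

```python
def distance_profile(D, site, available):
    """Distances from `site` to the other available sites, ranked increasingly."""
    return sorted(D[site][k] for k in available if k != site)


def same_distance_class(D, i, j, available):
    """Condition 1 only: sites i and j have identical ranked distance vectors."""
    return distance_profile(D, i, available) == distance_profile(D, j, available)
```

On a 2 x 3 grid with rectangular distances, for instance, the four corner sites share one profile and the two middle sites share another, so a facility needs to be tried on only one site of each class at the root of the tree.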

This test is easily computed and produces an important decrease in the size of the B&B tree to search.
As we will show further on, heuristics are able to provide good solutions (with a cost close to the value
of the optimal solution). Based upon these remarks, our B&B algorithm uses a depth-first search
strategy to examine the nodes which probably belong to the critical tree (nodes whose evaluation is
lower than the value of the optimal solution). It also uses, at a B&B node, a reduction test based upon
the search gap (the difference between the value of the best known solution and the value of the
lower bound of solutions belonging to this node): each assignment of a facility to a site which has a
cost greater than the gap is forbidden.
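
The gap test and the choice of the branching facility can be sketched as below. The meaning of `cost[f][s]` is an assumption: it is taken to be the increase of the node's lower bound caused by placing facility f on site s, and the strict comparison with the gap is also an assumption. Both function names are illustrative.

```python
def forbidden_assignments(cost, best_known, lower_bound):
    """Pairs (facility, site) whose cost exceeds the search gap."""
    gap = best_known - lower_bound
    return {(f, s)
            for f, row in enumerate(cost)
            for s, c in enumerate(row)
            if c > gap}


def branching_facility(cost, best_known, lower_bound):
    """Facility with the highest number of forbidden assignments.

    Ties are broken in favour of the lowest facility index.
    """
    forbidden = forbidden_assignments(cost, best_known, lower_bound)
    counts = [sum((f, s) in forbidden for s in range(len(row)))
              for f, row in enumerate(cost)]
    return max(range(len(cost)), key=counts.__getitem__)
```

Branching on the facility with the most forbidden assignments keeps the number of children of a node small, since only its remaining sites need to be explored.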

The branching scheme is polytomic: the facility with the highest number of forbidden assignments is
placed on all available sites.

2.3. Results of sequential and parallel B&B algorithms

Firstly, we present the results obtained by the fastest sequential B&B codes available: Burkard and
Derigs [9], Pardalos [14], Mautor and Roucairol [23].

The test data used are a classical benchmark for the QAP. The first ones, like the second, assign facilities on
a grid. As the distance between two adjacent sites is equal to one, these problems have a very special
structure and are not representative of a general QAP (as pointed out at the QAP day organized in
Bologna, September 92). But, pending a library of QAP test problems, it is the only present way to
compare results. Our algorithm is the fastest and obtains the best results even on problems on which
symmetries cannot be pointed out (Nugent 7, Elshafei 19).

The optimality of the solutions for famous problems of size up to 20 (size 19: Elshafei [15]; size 20:
Armour and Buffa [1]) is proved.

Problem                 Nugent     Nugent     Nugent       Nugent       Elshafei     Armour
Size                    8          12         15           16           19           20
Sites plant             Grid 4*2   Grid 4*3   Grid 5*3     Square 4*4   -            Grid 5*4
Best value              214        578        1150         1550         17,212,548   110,030

ALGORITHMS
Maut., Rouc. / Cray 2   0.04       3.4        [illegible]  969          1.4          1189
Burkard / Cyber 76      0.26       46.7       2947
Burkard / Cray 2        0.11       24.2       1290         never proved
Pardalos / IBM 3090     0.29       34.1       2005         never proved

The recent development both of commercially available multiprocessors and of the theoretical analysis
of the parallelization of B&B algorithms suggests that such parallel algorithms may be a fruitful area of
investigation for the QAP.

Only a few experiments have been made: two by our team, in 87 (Roucairol [33]) and in 92 (Mans,
Mautor, Roucairol [22]), and one by Pardalos in 89 [14]. They consist in parallelizing, on a synchronous
multiprocessor machine, a B&B algorithm using the GLB lower bound. Our latest proposal is, of
course, the parallelization of the new B&B algorithm presented in the previous section [23]. From
the point of view of parallelism, these B&B algorithms differ in the way they allocate tasks to the
different processors.

In Roucairol's first algorithm, a global heap is used to store the B&B nodes to be explored by
the processors. A task consists in exploring a node (creation and evaluation of successor nodes); each
processor requires access to the shared data structure to obtain such a node or to insert newly
generated nodes. In order to limit global accesses, Crouse and Pardalos [14] initially build several
heaps (each initialized with one node) in the shared memory, so that each processor selects one and can fully
explore it locally. Since the heaps are smaller, the maximum time that a processor could be idle
(waiting for a task) is smaller.

Our parallelization uses a different way to distribute the work to processors, through the notion of
feeding tree.

The feeding tree is the upper part of the B&B tree developed down to depth (or level) i; the leaves of
the feeding tree are the roots of the subtrees allocated to the processors, the tree stumps.

The first free processor initializes the left part of the feeding tree: the leftmost node at the chosen depth i
is generated by successive branching.

Only information about the path from the root to the last allocated nodes (facilities assigned to reach
this depth and set of the remaining available locations) has to be kept in the global memory.

Each idle processor accesses this shared structure and completes the exploration of a B&B subtree
which has a feeding tree stump as root and a maximal height of (n-i).


The various results obtained are summarized in the table below.


Algorithm      Machine     # procs   Nugent        Nugent         Nugent       Elshafei     Scriabin-Vergin
                                     size 12       size 15        size 16      size 19      size 20
Optimal solution                     578           1150           1550         17,212,548   110,030
Burkard        Cray 2      1         [illegible]   1290           not proved   not proved   not proved
Pardalos       IBM 3090    1                       2005           not proved   not proved   not proved
Pardalos       IBM 3090    4         10            out of space   not proved   not proved   not proved
Mans et al.    Cray 2      1         2.68          109            969          1.04         1189
Mans et al.    Cray 2      4         0.99          28             436          0.68         560
Mans et al.    Cray YMP    1         not tested    62             not tested   not tested   not tested
Mans et al.    Cray YMP    6         not tested    11             not tested   not tested   not tested

Table 1: Comparison of Running Times (in Seconds).
For our parallel algorithm, the parameter i, called the shared level, involved in the definition of the
feeding tree, allows the granularity and the load balancing between tasks to be tuned.

3. New  heuristics 

Of course, heuristics have been introduced to solve larger problems than those solvable by exact
approaches. The average results of the earliest heuristic solution methods:

- construction methods (they iteratively reach a complete assignment by locating one or more
facilities at each step),

- approximate exact methods (they reduce the B&B search to obtain only good solutions),

- exchange methods (they improve the cost of a complete assignment by interchanging the
locations of several facilities),

were rather good, but on some instances their results could be very bad. Thus, the most recent
approximate methods use sophisticated meta-heuristics like:

- simulated annealing (Burkard and Rendl [10], Lutton and Bonomi [21], Wilhelm and Ward
[37]),

- tabu search (Skorin-Kapov [34], Taillard [35]),

- evolution strategies (genetic algorithms: Brown et al. [5], Mühlenbein [24]; ant system,
neural networks...).

We must say that the best solutions are always found by these algorithms on the classical benchmark
test problems of size n less than twenty.

As previously argued, these problems have a very special structure: a lot of optimal or near-optimal
solutions exist. It has been pointed out that any approximate method which is able both to focus
at times around the current best solution and to explore new regions could perform very well.

We will briefly report some first experiments we made, in the context of the QAP, with massively
parallel algorithms implemented on the Connection Machine CM-2.

These results are encouraging but not yet comparable with other results from the literature, mainly
because until now we have implemented only a sketch of tabu search which does not contain all the
refinements. We will just indicate here the main characteristics of this approach.

Our idea was to use more than $n^2$ processors, where n is the size of the QAP problem. So, two tasks
run in parallel: the first one follows an intensification strategy whereas the second performs a
diversification phase. They synchronize just to exchange an improved solution before a new
iteration.

4. Conclusion

The assignment problem with a quadratic objective function remains very hard to solve. Even if we have
shown that a good branching scheme and good branching rules drastically reduce the number of nodes of
the search tree, a sharp lower bound is still needed to construct an efficient B&B and thus to solve
the QAP exactly. Meta-heuristics seem to give very good results, but a QAP library of new test
problems must be constructed to measure their effectiveness correctly.
1.  Introduction

Numerically difficult linear programming (LP) problems have always been a
challenge for developers of simplex based LP codes. Considerable effort has been
spent on improving the numerical behavior and the robustness of such systems.
It became clear early that the additive floating-point arithmetic operations (addition,
subtraction) are the sources of the problems, since their relative error can
be arbitrarily large in the framework of the finite and normalized number representation
of floating-point numbers. The approach to overcome the difficulties came from
two directions. The first one tried to analyze and modify the way these operations
are performed (cf. [OH68], [BGHR77], [Mar89]). The second one worked
out numerical procedures that possess provably better numerical characteristics. In
this respect a very important step was the introduction of the LU factorization of
the inverse of the basis. Several variants of it have been developed with different
pivot selection strategies (cf. [BGHR77], [SS90]).

2.  Additive  operations  in  the  simplex  method 

Additive operations in the simplex method occur during FTRAN and BTRAN
([OH68]) type operations. This is true for the inversion/factorization, as well. In
FTRAN, the operation is of a = b + c type, while in BTRAN the typical operation is
the inner product of two vectors where a number of additions take place. Numerical
problems may cause either the creation of a small number (white noise) in place
of a zero (type-1 error), or the creation of a zero in place of some significant value
(type-2 error).

A simplified example of a type-1 error can be a computation like a = 1 - 3*b,
where b was computed earlier as b = 1/3. Here a will not necessarily be zero. In
subsequent transformations this white noise can grow beyond the magnitude of
the pivot tolerance and can cause serious problems.

It can be equally problematic if an algebraic nonzero element appears to be
zero (type-2 error). This always happens if the relation of b and c is such that
$|b|\,\varepsilon_0 > |c|$, where $\varepsilon_0$ is the relative accuracy of the number representation, and
a = (b + c) - b is to be computed. Here, we will get a = 0 instead of a = c.
Type-2 error is a consequence of the lack of associativity of floating point additive
operations.
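
Both error types can be reproduced in IEEE 754 double precision, which is what Python floats use. The sketch below is an illustration, not taken from the paper. For the type-1 case the expression 0.1 + 0.2 - 0.3 is used instead of 1 - 3*(1/3), because in double precision the product 3*(1/3) happens to round to exactly 1.

```python
def rounding_demo():
    """Return (type-1 noise, type-2 result, reordered result)."""
    # Type-1: algebraically zero, numerically a tiny nonzero value.
    noise = (0.1 + 0.2) - 0.3

    # Type-2: algebraically c, numerically zero because c is absorbed by b.
    b, c = 1.0e16, 1.0
    lost = (b + c) - b

    # The same three operands in another order give the exact answer,
    # which shows the lack of associativity.
    kept = (b - b) + c
    return noise, lost, kept
```

Calling `rounding_demo()` returns a nonzero `noise` of magnitude below 1e-15, `lost == 0.0` and `kept == 1.0`.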

To reduce the probability of the occurrence of type-1 errors, a remarkable
step was made by Orchard-Hays [OH68], who introduced the relative tolerance
($\varepsilon_{rel} > 0$, small) for additive operations within the simplex method. Verbally,
its use can be described as follows. If too many significant digits are lost
during an additive operation, or in other words, if the magnitude of the result of an
additive operation with two operands is much smaller ($\varepsilon_{rel}$ times less) than the
larger magnitude of the operands, then the result is set to zero.

More formally, the algebraic expression a = b + c is numerically evaluated so
that if

$$\frac{|a|}{\max\{|b|,|c|\}} < \varepsilon_{rel} \qquad (2.1)$$

then a = 0 is set. Clearly, the choice of $\varepsilon_{rel}$ is critical. Typical values for $\varepsilon_{rel}$, in
the  case  of  double  precision  (8-byte)  real  numbers,  are  between  10“’^  and  10“'®. 
If a high level programming language (Fortran, C) is used, checking (2.1) is costly,
because of the division or the equivalent multiplication. At the same time, its use
in FTRAN is unavoidable unless some other technique is applied.
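
A direct transcription of test (2.1) is sketched below, using the equivalent multiplication instead of the division. The function name `add_reltol` and the default value of `eps_rel` are assumptions chosen for illustration only.

```python
def add_reltol(b, c, eps_rel=1e-10):
    """Evaluate a = b + c and reset a to zero if test (2.1) fires.

    The result is set to zero when its magnitude is less than eps_rel
    times the larger magnitude of the two operands.
    eps_rel is a hypothetical default, not a value from the paper.
    """
    a = b + c
    if abs(a) < eps_rel * max(abs(b), abs(c)):
        return 0.0
    return a
```

For example, `add_reltol(0.1 + 0.2, -0.3)` returns 0.0, whereas the plain sum leaves white noise of about 5.6e-17.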

Benichou et al. ([BGHR77]) selected an alternative method in FTRAN. They
set an $\alpha_i$ value in a transformed vector $\alpha$ to zero if $|\alpha_i| < \varepsilon u$ with an appropriate
$\varepsilon$, where u is the absolute value of the element with largest magnitude created
during the $\alpha = B^{-1}a$ operation.

The  main  purpose  of  the  above  techniques  is  to  help  distinguish  between  white 
noise  and  zero  to  avoid  the  selection  of  a  zero  pivot  in  the  transformed  vector  q. 

During BTRAN, inner products of vectors are computed. Inner products are
well known for their bad numerical characteristics. Their computation requires an
accumulator collecting the componentwise products. This computation is very
much prone to type-2 error. Using (2.1) cannot prevent this. Among several
possibilities, a relatively simple idea proved to be quite efficient. Namely, an inner
product s = p'a can be computed in such a way that the negative and positive
terms are accumulated separately and added together at the end, thus giving a
chance to type-2 error only once per inner product:

$$s = \sum_{p_i a_i > 0} p_i a_i + \sum_{p_i a_i < 0} p_i a_i \qquad (2.2)$$

Clearly, the above and some other techniques that do not go beyond the traditional
number representation and operations will never be able to guarantee
the exact outcome of the critical operations in the simplex method.
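
Scheme (2.2) needs only two accumulators, as in the following sketch. The function name `dot_split` is illustrative; the paper's implementation language and details are not given.

```python
def dot_split(p, a):
    """Inner product with separate accumulation of positive and negative terms.

    Within each accumulator all terms have the same sign, so no
    cancellation can occur there; the only cancellation is the single
    final addition.
    """
    pos = 0.0
    neg = 0.0
    for pi, ai in zip(p, a):
        term = pi * ai
        if term > 0.0:
            pos += term
        else:
            neg += term   # zero terms land here and change nothing
    return pos + neg
```

Because cancellation is confined to the last addition, a tolerance test such as (2.1) would have to be applied only once per inner product.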
3.  An  accurate  floating-point  arithmetic

The idea of (2.2) can be refined. If we divide the possible range of the $p_i a_i$
values into N consecutive intervals (buckets) defined by points $t_0, t_1, \ldots, t_N$, then
the inner product can be computed in the following way:

$$\sum_i p_i a_i = \underbrace{\sum_{t_0 \le p_i a_i < t_1} p_i a_i}_{\text{bucket}_1} + \underbrace{\sum_{t_1 \le p_i a_i < t_2} p_i a_i}_{\text{bucket}_2} + \cdots + \underbrace{\sum_{t_{N-1} \le p_i a_i < t_N} p_i a_i}_{\text{bucket}_N} \qquad (3.1)$$

Here, the choice of $t_0, t_1, \ldots, t_N$ can control the generation and propagation
of error. Addition of values falling into the same bucket will have smaller error.
Addition  of  the  contents  of  the  buckets  must  be  made  in  increasing  bucket  order  to 
help  reduce  problems  due  to  the  lack  of  associativity.  The  implementation  of  this 
idea  requires  N  accumulators  and  additional  logic,  while  it  still  does  not  guarantee 
full  accuracy  in  all  cases. 
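
The bucket scheme (3.1) can be sketched with one accumulator per interval. The boundaries are passed in by the caller; how to choose them, and the decision to reject terms outside the covered range, are assumptions of this illustration. The name `dot_buckets` is hypothetical.

```python
from bisect import bisect_right


def dot_buckets(p, a, t):
    """Inner product accumulated in N buckets with boundaries t[0] < ... < t[N].

    A term falls into bucket k when t[k] <= term < t[k + 1].
    Terms outside [t[0], t[N]) raise ValueError in this sketch.
    """
    acc = [0.0] * (len(t) - 1)          # N accumulators
    for pi, ai in zip(p, a):
        term = pi * ai
        k = bisect_right(t, term) - 1   # index of the bucket holding term
        if not 0 <= k < len(acc):
            raise ValueError("term outside the bucket range")
        acc[k] += term

    total = 0.0
    for value in acc:                   # add buckets in increasing order
        total += value
    return total
```

With boundaries `t = [-100.0, -1.0, 0.0, 1.0, 100.0]`, the terms 4, -10, 18 and 0.25 fall into buckets 4, 1, 4 and 3 and add up to 12.25.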

To restore associativity, we propose a further refinement of (3.1). Now we
assume that the exponents of the terms, intermediate, and final results on the left
hand side of (3.1) taken in any order fall into the $[e_l, e_u]$ interval. In such a case
one single super register will be able to accumulate the sum of the terms with full
accuracy if it is large enough.

The idea of our super register (SR) can be sketched as follows. The length
L of SR is defined in units of 4 Bytes. One unit is reserved for overflow. L is
a parameter that can be assigned different values in advance depending on the
estimated range of the values. SR is divided into the following parts:

| 4 Bytes       | 2(L-1) Bytes | 2(L-1) Bytes    |
| overflow area | Integer bits | Fractional bits |
  (the binary point lies between the integer bits and the fractional bits)

Fig. 1. The Super Register

SR has been designed to accumulate inner products. Multiplication is performed
by the numeric coprocessor and the result is added to SR. For the case
when SR turns out to be too small, a 10-byte long register (ER, not shown in
Fig. 1.) is added to it. If the value to be added to SR falls outside the range of
SR, ER will be used. Of course, in such a case the accuracy will be reduced to
that of the traditional representation.

Clearly, the maintenance of and operations with SR practically cannot be
achieved in a high level language. Therefore, we decided to use 386/486 based
PCs as target machines and assembly language. We defined three operations with
SR: (1) aclear to clear SR, (2) aadd(a,b) to add the product of a and b to SR,
and (3) aread(c) to retrieve the contents of SR and store it in c as an 8-Byte
double precision number. These operations can be activated by subroutine calls
from a Fortran program. Details of implementation are omitted here.
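
The behaviour of the three operations, though not their speed or their fixed-point layout, can be imitated with exact rational arithmetic. This is a stand-in, not the authors' assembly implementation: `fractions.Fraction` is unbounded, so neither the overflow area nor the ER fallback is modelled, and `aread` returns the value instead of storing it in an argument. The class name `SuperRegister` is hypothetical.

```python
from fractions import Fraction


class SuperRegister:
    """Exact accumulator imitating aclear / aadd / aread."""

    def __init__(self):
        self.aclear()

    def aclear(self):
        """Clear the register."""
        self._sum = Fraction(0)

    def aadd(self, a, b):
        """Add the product of a and b to the register.

        The product itself is formed in ordinary floating point, as in
        the text; only the accumulation is exact.
        """
        self._sum += Fraction(a * b)

    def aread(self):
        """Return the contents rounded to a double precision number."""
        return float(self._sum)
```

Accumulating the terms 1e16, 1, -1e16, 1 in this register gives 2.0, whereas plain left-to-right floating-point addition of the same terms gives 1.0, because the first 1 is absorbed.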

At the beginning we had to answer the very important question: Will the
operations be fast enough to make the whole idea workable? To answer it, we
made test runs on a 486/33 MHz (256 K on-board cache) PC with L = 16 to
compare the new operations with the corresponding traditional ones using some
accuracy control technique. This value of L resulted in an SR with exponent range
of ≈ [-76,76], and with a resolution of throughout the whole range. The
following table shows execution times in microseconds. For the new operations
the overhead of subroutine calls is included, but we also give timing for subroutine
calls with 0, 1, and 2 parameters.


Operation    Time
---------    ----
aclear       3.90
aadd         5.43
aread        8.79
call-0       2.75
call-1       2.74
call-2       2.75
(2.1)        7.03
(2.2)        4.17

The above values were obtained as averages of 10® attempts. Here, call-0,
call-1, and call-2 mean the overhead of subroutine calls with 0, 1, 2 parameters,
respectively.


In another series of experiments we computed the inner product
(s = p'a) of two vectors with 1,000 nonzero elements each, 1,000 times. The next
table gives the average time (in microseconds) of one s = p'a inner product.


Method    Time
------    ----
(2.1)     7250
(2.2)     3410
SR        6540

The  conclusion  of  these  figures  is  that  the  use  of  the  super  register  will  not 
necessarily  slow  down  the  computations  in  the  simplex  method,  especially  if  the 
computer  is  equipped  with  on-board  cache  memory. 
4. Accurate arithmetic in the simplex method

Though  the  size  of  one  super  register  is  negligible,  the  memory  requirement 
of vectors of super registers can be prohibitively large. Therefore, we attempted
to  use  one  single  super  register  throughout  the  simplex  method  and  carry  out  all 
the  critical  floating  point  operations  with  this  register.  This  idea  entails  that  the 
operations  are  to  be  reorganized  to  become  inner  products,  when  possible. 

In the case of the LU form of the basis (both L and U are stored
columnwise) the critical operations are inversion (factorization), BTRAN, and FTRAN.
In  addition  to  them,  after  inversion  and  before  recomputing  the  basic  solution  the 
right-hand-side  is  to  be  adjusted  to  account  for  nonbasic  variables  at  upper  bound 
and  for  super  basic  variables: 


    \bar{b} = b - \sum_{j \in J} a_j u_j,    (4.1)

where b is the original right-hand-side vector, $a_j$ is the column vector of variable $x_j$, $u_j$
is its current nonbasic value, and J is the index set of nonzero nonbasic variables.
This operation can also be a source of numerical problems.
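
Operation (4.1) can itself be organized as one inner product per row, which is the form suited to a single accumulator. The following Python sketch illustrates this under stated assumptions: the function name adjust_rhs and the dictionary-based data layout are hypothetical, and exact rational accumulation (fractions.Fraction) stands in for the super register.

```python
from fractions import Fraction


def adjust_rhs(b, columns, u):
    """Return b - sum_j a_j u_j, rounding each component only once.

    b       : list of floats (original right-hand side)
    columns : dict j -> list of floats (column a_j), assumed layout
    u       : dict j -> float (current nonbasic value of variable j)
    """
    adjusted = []
    for i, b_i in enumerate(b):
        acc = Fraction(b_i)  # exact accumulator, stand-in for SR
        for j, u_j in u.items():
            acc -= Fraction(columns[j][i]) * Fraction(u_j)
        adjusted.append(float(acc))  # single rounding on read-out
    return adjusted
```

For instance, with b = [1.0, 2.0], columns {0: [1e16, 1.0], 1: [-1e16, 0.5]} and both nonbasic values equal to 1.0, the first component comes out as 1.0; a plain left-to-right double precision evaluation of the same row loses it to cancellation.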

In BTRAN the operations are originally inner products, therefore nothing is
to be reorganized.

Regarding  the  nature  of  operations,  the  LU  factorization  and  FTRAN  are 
similar,  namely,  they  both  require  FTRANs. 

After factorization the basis is written as $B = LU$, with $B^{-1} = U^{-1} L^{-1}$. The
factorization itself can be carried out using one SR with heavy logic but without
any compromise. The details are omitted here.

During simplex iterations the situation is a bit more complicated. If k basis
changes have been made after factorization then $B^{-1} = E_k \dots E_1 U^{-1} L^{-1}$,
where $E_i$ is an elementary transformation matrix. When this form is used
for FTRAN to compute $\alpha = B^{-1} a = E_k \dots E_1 U^{-1} L^{-1} a$, the transformations with
$L^{-1}$ and $U^{-1}$ can be performed one after another with the SR using a dynamic
linked list (DLL). For the transformation with $E_k \dots E_1$, we have two options: (1)
this operation is carried out in some traditional way, e.g. (2.1), (2) at the expense
of some additional administration and operations the use of SR can be extended
to this case.
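
A small dense Python sketch of this FTRAN organization follows. It is only an illustration: the function names are hypothetical, L and U are held as dense row lists rather than columnwise with a DLL, exact rational accumulation stands in for the SR, and the eta transformations are applied in ordinary floating point, which corresponds to option (1).

```python
from fractions import Fraction


def inner(row, vec, indices):
    """Exactly accumulated sum of row[j] * vec[j] over the given indices."""
    acc = Fraction(0)
    for j in indices:
        acc += Fraction(row[j]) * Fraction(vec[j])
    return acc


def ftran(L, U, etas, a):
    """Compute E_k ... E_1 U^{-1} L^{-1} a for dense triangular L and U.

    etas is a list of (p, eta) pairs, E_1 first; E is the identity with
    column p replaced by the vector eta.
    """
    n = len(a)
    y = [0.0] * n
    for i in range(n):  # forward substitution, one inner product per row
        s = Fraction(a[i]) - inner(L[i], y, range(i))
        y[i] = float(s / Fraction(L[i][i]))
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution, same pattern
        s = Fraction(y[i]) - inner(U[i], x, range(i + 1, n))
        x[i] = float(s / Fraction(U[i][i]))
    for p, eta in etas:  # option (1): traditional floating point updates
        xp = x[p]
        x = [eta[i] * xp if i == p else x[i] + eta[i] * xp for i in range(n)]
    return x
```

For L = [[1, 0], [2, 1]], U = [[2, 1], [0, 3]] and a = [3.0, 9.0], the call ftran(L, U, [], a) solves B x = a with B = LU.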

The  additional  complications  require  additional  memory,  since  DLL  needs 
integer  arrays  of  size  m  (the  number  of  constraints  in  the  LP  problem)  which  can 
be  too  much  in  certain  cases. 

5.  Computational  experiences 

To test the above ideas, we included the super register technique in our MILP
linear programming optimizer [Mar91]. Presently (November, 1992), this
implementation is temporary, not full-scale, and experimental. “Not full-scale” means
that our first version uses alternative (1) of the previous section, which is clearly
the simplest and the poorest version. In this way, the present implementation can
be considered a hybrid one.

In addition to this, it is important to note that the code is not tuned yet,
and a number of measuring instructions are also present, influencing the actual
performance.

The  purpose  of  tests  was  twofold:  (1)  to  see  the  extent  of  improvement  in 
numerical  accuracy,  (2)  to  get  an  idea  about  the  speed  of  the  new  technique  in 
comparison with the earlier, “traditional” version. MILP originally uses (2.1) type
relative zeroing.

For testing we used some numerically nontrivial problems from NETLIB
[Gay85]. A smaller representative of this category of problems is GROW7.

We made a special test with GROW7. First, we used the 80 bit extended
accuracy of the arithmetic coprocessor, and applied two alternative ways to compute
FTRAN. The only difference was the order in which the inner products were computed.
At iteration 371 this resulted in a different basis change. One of them was a wrong
one due to accumulated errors. The correction of this error took several iterations
after the next factorization. When the SR technique was used with L = 16,
the basis changes were not influenced by different ways of FTRAN, showing that
associativity is better achieved with the present environment of SR.

In the table below we summarize our experiences with the original and SR
versions of MILP, where the SR version is a temporary one as described above. The
old version of MILP uses the product form of the inverse. The programming language
is Fortran77; the program was compiled with Lahey F77L-EM/32 V4.0. Solution times
on a 486 PC are given in seconds.


Problem   Version   Iter.     Time   Remark
-------   -------   -----   ------   ------
GROW7     Old         526    30.04   Minor numerical troubles
GROW7     SR          538    40.59   -
GROW15    Old        2625   340.30   * Num. troubles, solution abandoned
GROW15    SR          893   155.16   One tiny corrected inaccuracy
GROW22    Old        1753   397.66   * Serious troubles, solution abandoned
GROW22    SR         1400   446.71   Few small corrected inaccuracies
STAIR     Old         752   127.81   Opt. sol. achieved after numerical troubles
STAIR     SR          543   123.08   -
PILOT4    Old         775   107.77   * Num. troubles, solution abandoned
PILOT4    SR         1049   267.54   -

The occasional minor troubles with the SR version mean that after
factorization a very small deterioration in the objective value or feasibility might have
occurred. The reason for that is that operation (4.1) is not performed with SR yet.

The brief message of the above table is that the iteration speed does not slow
down seriously by using SR, while the accuracy expectations tend to be fulfilled. In
the “non-temporary” version the speed of factorization will be doubled, while the
implementation of additional refinements will have a slight counter-effect. After
all, only the complete implementation of the idea of SR will give the final answer
as to its applicability and usefulness.

It is worth mentioning here that we also tested SR in the framework of our
interior point algorithm. The algorithm is of primal affine scaling type. SR was
used in factorization only. We observed an increase in accuracy of half to 2 orders
of magnitude on problems GROW7, GROW22, and PILOT4.
1. Introduction

Given n items and m units, the penalty, $p_{ij}$, and the resource requirement, $r_{ij}$,
corresponding to the assignment of item j to unit i ($j = 1, \dots, n$; $i = 1, \dots, m$),
and the amount of resource $a_i$ available at unit i ($i = 1, \dots, m$), the Bottleneck
Generalized Assignment Problem (BGAP) is to assign each item to one unit so that
the total resource requirement for any unit does not exceed its availability and the
maximum penalty incurred is minimized. We will assume in the following that
all numerical input data ($p_{ij}$, $r_{ij}$, $a_i$) are non-negative integers. By introducing
binary variables $x_{ij}$, with


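    x_{ij} = \begin{cases} 1 & \text{if item } j \text{ is assigned to unit } i; \\ 0 & \text{otherwise,} \end{cases}

the problem can be formulated as

    minimize    z = \max_{i \in M,\, j \in N} \{ p_{ij} x_{ij} \}                       (1)

    subject to  \sum_{j=1}^{n} r_{ij} x_{ij} \le a_i,    i \in M = \{1, \dots, m\};    (2)

                \sum_{i=1}^{m} x_{ij} = 1,               j \in N = \{1, \dots, n\};    (3)

                x_{ij} \in \{0, 1\},                     i \in M,\ j \in N.            (4)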
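
To make the model (1)-(4) concrete, here is a minimal Python sketch that evaluates an assignment and solves a tiny instance by complete enumeration. The function names and the instance data are invented for illustration, and the enumeration is exponential in n, so it is only meant as a reference for very small problems.

```python
from itertools import product


def bgap_value(assign, p, r, a):
    """Return the bottleneck value (1) of an assignment, or None if (2) is violated.

    assign[j] is the unit of item j, so constraints (3) and (4) hold by construction.
    p[i][j], r[i][j] are penalty and requirement of item j on unit i; a[i] is capacity.
    """
    load = [0] * len(a)
    for j, i in enumerate(assign):
        load[i] += r[i][j]
    if any(load[i] > a[i] for i in range(len(a))):
        return None
    return max(p[i][j] for j, i in enumerate(assign))


def solve_bgap_brute_force(p, r, a):
    """Enumerate all m**n assignments and keep the best feasible one."""
    m, n = len(a), len(p[0])
    best = None
    for assign in product(range(m), repeat=n):
        value = bgap_value(assign, p, r, a)
        if value is not None and (best is None or value < best[0]):
            best = (value, assign)
    return best
```

For p = [[4, 2, 7], [3, 5, 1]], all requirements equal to 2 and capacities a = [4, 2], the best assignment puts items 1 and 2 on the first unit and item 3 on the second, with bottleneck value 4.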
The problem has applications in the fields of scheduling and allocation, among
others. Suppose for example that each of n urban areas has to be served by one of
m emergency centres. Let $p_{ij}$ be the travel time between centre i and area j, $r_{ij}$
the expected workload for centre i if area j is allocated to it, and $a_i$ the maximum
workload that centre i can support. The solution to BGAP will then give the
feasible solution minimizing the worst-case travel time between an area and the
associated centre.

BGAP is the min-max version of the well-known (min-sum) Generalized
Assignment Problem (GAP), given by

    minimize    \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij} x_{ij}

    subject to  (2), (3), (4).

It is known that GAP is NP-hard in the strong sense, since even its feasibility
question is so (see, e.g., Martello & Toth (1990), Ch. 7). Hence the same result
applies to BGAP.

Several contributions to the solution of GAP can be found in the literature
(Ross & Soland (1975), Martello & Toth (1981, 1990), Fisher, Jaikumar & Van
Wassenhove (1986), Jornsten & Nasberg (1986), Guignard & Rosenwein (1989),
among others). Mathematical models and a technique for transforming BGAP
into GAP have been given by Mazzola & Neebe (1988); to our knowledge no other
result has been published in the literature.

In the next section we introduce lower bounds for the problem, which require
the solution of bottleneck knapsack problems. Approximate algorithms are
examined in Section 3. In Section 4 we describe a branch-and-bound algorithm for exact
solution of the problem, and in Section 5 examine its computational behaviour.


2.  Relaxations  and  Lower  Bounds 

We consider lower bounds obtained by relaxing constraints (2), either directly (by
decreasing the resource requirements) or through surrogate techniques. In any
case, we never allow an item j to be assigned to a unit i if, in the non-relaxed
instance, $r_{ij} > a_i$.

Relaxation  of  the  Resource  Requirements 

An immediate lower bound can be obtained by eliminating constraints (2). It is
then evident that the resulting problem can be exactly solved, in O(nm) time, by
determining

    i_1(j) = \arg\min_{i} \{ p_{ij} : r_{ij} \le a_i \}    (j \in N)

and computing the lower bound value

    L_0 = \max_{j \in N} \{ p_{i_1(j), j} \}.

If the corresponding solution (obtained by assigning each item j to unit $i_1(j)$)
satisfies constraints (2), then this is clearly the optimum. Otherwise, a better
bound $L_1$ can be obtained by imposing one of the violated constraints (as shown
in Martello & Toth (1991)). The time complexity for the computation of $L_0$ and
$L_1$ is O(nm).
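
The first of these bounds is simple enough to sketch directly. The Python function below is illustrative only: its name and return convention are assumptions, and the improved bound obtained by imposing a violated constraint is not reproduced.

```python
def lower_bound_l0(p, r, a):
    """Bound obtained by dropping constraints (2).

    Returns (bound, i1, is_optimal), where i1[j] is the cheapest unit that can
    hold item j on its own and is_optimal tells whether that assignment
    happens to satisfy (2). Returns None if some item fits on no unit.
    """
    m, n = len(a), len(p[0])
    i1 = []
    for j in range(n):
        feasible = [i for i in range(m) if r[i][j] <= a[i]]
        if not feasible:
            return None
        i1.append(min(feasible, key=lambda i: p[i][j]))
    bound = max(p[i1[j]][j] for j in range(n))
    loads = [sum(r[i][j] for j in range(n) if i1[j] == i) for i in range(m)]
    is_optimal = all(loads[i] <= a[i] for i in range(m))
    return bound, i1, is_optimal
```

On the small instance p = [[4, 2, 7], [3, 5, 1]], unit requirements all equal to 2 and a = [4, 2], the bound is 3 and the corresponding assignment overloads the second unit, so the bound is not attained.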

Surrogate  Relaxation 

For a given vector $(\pi_i)$ of non-negative multipliers, we define the surrogate
relaxation of BGAP, $S(BGAP, \pi)$, as

    minimize    z(\pi) = \max_{i,j} \{ p_{ij} x_{ij} \}

    subject to  \sum_{i=1}^{m} \pi_i \sum_{j=1}^{n} r_{ij} x_{ij} \le \sum_{i=1}^{m} \pi_i a_i,

                (3), (4).

For any non-negative vector $(\pi_i)$,

    L_2(\pi) = z(\pi)

is then a lower bound for BGAP. It is proved in Martello & Toth (1991) that,
for any vector $(\pi_i)$ of multipliers, lower bound $L_2(\pi)$ can be computed in O(nm)
time.
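
A straightforward way to evaluate the surrogate bound is sketched below in Python. It is not the O(nm) procedure cited above: the function name is hypothetical, and the sketch simply scans the distinct penalty values in increasing order, checking for each threshold whether the single surrogate constraint can be met when every item takes its lightest admissible unit.

```python
def surrogate_bound(p, r, a, pi):
    """Smallest threshold d for which the surrogate relaxation is feasible.

    For a fixed d an item may use unit i only if p[i][j] <= d and
    r[i][j] <= a[i]; the surrogate constraint is easiest to satisfy when each
    item picks the admissible unit with minimum pi[i] * r[i][j].
    """
    m, n = len(a), len(p[0])
    budget = sum(pi[i] * a[i] for i in range(m))
    for d in sorted({p[i][j] for i in range(m) for j in range(n)}):
        total = 0
        for j in range(n):
            weights = [pi[i] * r[i][j] for i in range(m)
                       if p[i][j] <= d and r[i][j] <= a[i]]
            if not weights:
                break  # item j has no admissible unit at this threshold
            total += min(weights)
        else:
            if total <= budget:
                return d
    return None
```

On the instance p = [[4, 2, 7], [3, 5, 1]] with all requirements 2 and a = [4, 2], the multipliers (1, 1) give the bound 3, while (0, 1) give 4.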

3.  Approximation  Algorithms 

A feasible solution to BGAP of value not greater than a given threshold d can
be heuristically found through a procedure which considers the items according to
decreasing values of the difference between second smallest and smallest resource
requirement for a feasible assignment, and assigns the considered item to the unit
having the smallest resource requirement. If an item is found for which no feasible
assignment is possible, the procedure returns no solution; otherwise it returns
the feasible solution found. This procedure, which can be implemented to run
in $O(nm \log m + n^2)$ time, can be used to determine an approximate solution to
BGAP by searching for the lowest value d for which a feasible solution is returned.
If this is done through binary search, the overall time complexity of the resulting
approximation algorithm is $O(nm \log m + \beta(nm + n^2))$, where $\beta$ denotes the number
of bits required to encode $\max_{i,j}\{p_{ij}\}$.
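
The threshold procedure and the binary search around it can be sketched in Python as follows. This is a plain, unoptimized reading of the description above (it recomputes the differences after every assignment and therefore does not reach the stated time bound); the function names are hypothetical, and the search runs over the sorted distinct penalty values rather than over the bits of the largest penalty.

```python
import math


def threshold_heuristic(p, r, a, d):
    """Try to build a feasible assignment using only pairs with p[i][j] <= d."""
    m, n = len(a), len(p[0])
    cap = list(a)
    assign = [None] * n
    unassigned = set(range(n))
    while unassigned:
        best_item, best_gap, best_unit = None, -1.0, None
        for j in sorted(unassigned):
            feas = sorted((r[i][j], i) for i in range(m)
                          if p[i][j] <= d and r[i][j] <= cap[i])
            if not feas:
                return None  # no feasible unit left for item j
            # Difference between second smallest and smallest requirement.
            gap = math.inf if len(feas) == 1 else feas[1][0] - feas[0][0]
            if gap > best_gap:
                best_item, best_gap, best_unit = j, gap, feas[0][1]
        assign[best_item] = best_unit
        cap[best_unit] -= r[best_unit][best_item]
        unassigned.remove(best_item)
    return assign


def approximate_bgap(p, r, a):
    """Binary search for the lowest threshold at which the heuristic succeeds."""
    values = sorted({v for row in p for v in row})
    lo, hi, best = 0, len(values) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        sol = threshold_heuristic(p, r, a, values[mid])
        if sol is None:
            lo = mid + 1
        else:
            best, hi = sol, mid - 1
    if best is None:
        return None
    return max(p[i][j] for j, i in enumerate(best)), best
```

On the instance p = [[4, 2, 7], [3, 5, 1]] with all requirements 2 and a = [4, 2], the search ends at threshold 4 with the assignment [0, 0, 1].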

Other  approximation  algorithms  are  described  in  Martello  &  Toth  (1991). 
4. A Branch-and-Bound Algorithm

The results of the previous sections have been used to obtain a branch-and-bound
algorithm for the exact solution of BGAP.


Branching  Scheme 

The  algorithm  consists  of  a  depth-first  search  in  which,  at  each  level,  an  item  j* 
is  selected  for  branching  and  assigned,  in  turn,  to  all  feasible  units. 

The branching item is selected according to the following criterion. Let z
denote the best incumbent solution value, L the current lower bound value, $a_i$
($i = 1, \dots, m$) the amount of resource currently available for unit i, and U the set of
currently unassigned items. For each $j \in U$, $M_j = \{i \in M : r_{ij} \le a_i \text{ and } p_{ij} \le L\}$
is the set of units to which j can be assigned without increasing the lower bound.
Hence

    v_j = \frac{\dots}{|M_j|}

represents the average percentage resource requirement of item j, while

    s_j = \min_2 \{ r_{ij} : i \in M_j \} - \min \{ r_{ij} : i \in M_j \}

(with $\min_2$ denoting the second smallest value) is the minimum additional resource
requirement if item j is not assigned to the unit with minimum requirement. Since
the higher $v_j$ or $s_j$, the more critical is item j, the branching item is selected through

    j^* = \arg\max_{j \in U} \{ \dots + s_j) \}.


Now let $\bar{M}_j = \{i \in M : r_{ij} \le a_i \text{ and } p_{ij} < z\}$ denote the set of feasible units
for item $j \in U$. $|\bar{M}_{j^*}|$ son nodes are generated by assigning $j^*$ to all $i \in \bar{M}_{j^*}$
according to increasing values of $r_{ij^*}$, and the search is resumed from the first of
these nodes.


Fathoming  Decision  Nodes 

Consider a decision node generated, say, by assigning item $j_a \in U$ to unit $i_a$.
Before computing the corresponding lower bound value, the following dominance
criterion is applied. The node can be fathomed if there exists an item $j_b \notin U$,
currently assigned to a unit $i_b \ne i_a$, such that by interchanging the assignments
the current lower bound and resource requirements do not increase.
Initialization Phase

At the root node of the branch-decision tree, a lower bound $L^*$ on the optimal
solution value is first computed. The approximation procedure of Section 3 is
then applied, obtaining a first incumbent solution of value z. If $z > L^*$, the first
branching item is determined, and the enumeration process begins.

5.  Computational  Experiments 

The branch-and-bound algorithm of the previous section was coded in Fortran and
computationally tested on a Digital VAXstation 3100.

The computational experiments were performed on four classes of randomly
generated problems. Table 1 gives, for different values of n and m, the average
number of decision nodes and the average CPU times (expressed in seconds)
computed over ten problem instances. For each instance, the execution was halted
as soon as the number of nodes reached 10®. For such cases, we give (in brackets)
the number of solved problems and compute the average values over them.

Class (1) was introduced for GAP by Ross & Soland (1975):

(1) $r_{ij}$ uniformly random in [5, 25], $i \in M$, $j \in N$,

    $p_{ij}$ uniformly random in [10, 50], $i \in M$, $j \in N$,
    $a_i = 9(n/m) + 0.4 \max_{i \in M} \{ \sum_{j \in N : i_1(j) = i} r_{ij} \}$.

The  results  show  that  the  problems  of  this  class  are  very  easy.  Most  of  the  instances 
were  solved  by  the  initialization  phase.  The  computing  time  increases  almost 
linearly  with  both  n  and  m. 

More difficult problems can be obtained by decreasing the $a_i$ values:

(2) $r_{ij}$ and $p_{ij}$ as for Class (1),

    $a_i$ = 60% of the value obtained for Class (1), $i \in M$.

The computational results show indeed a considerable increase in the computing
times, especially for m=5 and m=10. Most of the instances of this class admit no
feasible solution for m=2 or 3, and very few feasible solutions for m=5 or 10.

For both Classes (1) and (2) the range of the penalty values is very limited.
In order to test the behaviour of the algorithm when the optimal solution value
must be found in a larger range, we considered the following class:

(3) $r_{ij}$ uniformly random in [1, 1000], $i \in M$, $j \in N$,

    $p_{ij}$ uniformly random in [1, 1000], $i \in M$, $j \in N$,

    $a_i = 0.6 \sum_{j \in N} \dots$

The  results  show  a  satisfactory  performance  of  the  algorithm.  For  almost  all  values 
of  m  the  difficulty  of  the  instances  increases  for  n  going  from  10  to  100. 

The  last  class  was  obtained  by  introducing  a  correlation  between  penalties 
and  resource  requirements: 
Table 1. Exact solution. VAXstation 3100 seconds.
Average times / Average numbers of nodes over 10 problems (? = illegible entry).

            Class (1)         Class (2)         Class (3)          Class (4)
  m    n    time   nodes      time   nodes      time    nodes      time   nodes
  2   10       ?       1      0.01       0      0.01        0      0.01       2
      25       ?       3      0.01       0         ?        0      0.01       0
      50       ?       0         ?       0         ?        0      0.03       ?
     100    0.03       0         ?       0      0.03        0      0.03       0
  3   10       ?       0      0.07      13      0.10        6         ?       5
      25    0.03       3      0.09      27      0.34       67         ?      16
      50       ?       ?      0.30      76      0.21       19         ?       4
     100    0.04       ?      0.05       0      3.02      404         ?       0
  5   10    0.01       ?      0.11      18      0.10        3      0.10       7
      25       ?       ?      5.69    1856      1.58      275      0.60     129
      50       ?       0      8.35    1950      5.92     1162      0.04       0
     100    0.06       0     70.99   13078     10.25     1557      0.06       0
 10   10       ?       1      0.07       2         ?        1      0.04       1
      25       ?       ?      1.21     177      0.76       74      2.77     401
      50       ?       ?     41.96    5951      7.46      786    367.40   54926
     100    0.11       0    338.05   41311   0.22(9)        0      0.12       0

Table  2.  Approximate  solution.  VaxStation  3100  seconds. 
Average  times  /  Average  percentage  errors  over  10  problems. 


Class  (1) 

Class 

(2) 

Class 

(3) 

Class 

(4) 

m  n 

time 

%  err. 

time 

%  err. 

time 

%  err. 

time 

%  err. 

2  10 

■9 

0.00 

■■ 

0.01 

0.00 

0.02 

0.00 

25 

0.00 

■■ 

0.01 

0.00 

0.01 

0.00 

50 

■Eg 

0.00 

■■ 

0.00 

0.02 

0.00 

100 

0.03 

0.00 

mm 

0.00 

0.03 

0.00 

0.03 

0.00 

3  10 

0.01 

0.00 

0.05 

0.09 

0.00 

0.06 

0.00 

25 

0.03 

0.00 

0.03 

0.16 

0.12 

0.09 

0.00 

50 

0.02 

0.00 

0.04 

0.12 

0.00 

0.06 

0.00 

100 

0.04 

0.00 

0.04 

^B 

0.41 

0.00 

0.04 

0.00 

5  10 

0.01 

0.00 

0.10 

0.00 

0.08 

0.00 

25 

0.02 

0.00 

■ 

0.31 

0.28 

1.03 

0.19 

0.65 

50 

0.03 

0.00 

0.28 

0.51 

0.04 

0.00 

100 

0.06 

0.00 

mm 

0.26 

0.06 

0.00 

10  10 

0.04 

0.00 

0.06 

0.00 

MEM 

0.04 

MB 

25 

0.03 

0.00 

0.30 

1.54 

■S9 

0.37 

■Eg 

50 

0.06 

0.00 

0.20 

0.00 

0.95 

1.02 

0.22 

MM 

100 

0.11 

0.00 

0.31 

0.00 

0.54 

1.34 

0.00 
(4) $r_{ij}$ uniformly random in [1, 800], $i \in M$, $j \in N$,

    $p_{ij}$ uniformly random in [1, 1000 - $r_{ij}$], $i \in M$, $j \in N$,
    $a_i = \dots\, r_{ij}/m$, $i \in M$.

The computational results show good behaviour of the algorithm for these
problems too, with comparatively higher computing times for n < 50.

In Table 2 we analyse the performance of the approximation algorithm of
Section 3. For the same instances as Table 1 we give the average CPU time
(expressed in seconds) and the average percentage error $100(z^a - z)/z$, where $z^a$
is the approximate value, and z the optimal solution value or the lower bound
value computed in the initialization phase (for the instances not solved exactly).
The results show very good behaviour of the approximation algorithm, both for
running time and the quality of the solutions found.
NEURAL NETWORK ALGORITHMS FOR COMBINATORIAL

OPTIMIZATION PROBLEMS: THEORY AND EXPERIENCE

Igor I. Melamed

Many combinatorial optimization problems are hard
to treat, so new approaches would be fruitful. Over
the past years, many attempts have been made to
solve some combinatorial optimization problems, such
as the traveling salesman and the graph partitioning
problems, by applying algorithms based on neural
networks. It is certainly of great interest to try to
understand how a parallel structure, like a neural
network, can solve problems of this type. It is
likely that studies of this kind could be useful in
understanding some mechanisms of the nervous system of
animals. The effectiveness of this approach has been
discussed in [1].

In this paper, we present a theoretical study and
computational experiments for solving the assignment,
the quadratic assignment, the matching, the graph
partitioning, the graph coloring, the knapsack, the
set covering, partitioning and packing, the Weber-Fermat,
the transport and the transshipment, the
maximal flow, [...] the linear and the
convex programming problems using Hopfield's
neural networks.

Hopfield's neural network (HNN) is an undirected
graph $G = (V, E)$, where V is the set of nodes, $|V| = n$,
and E is the set of edges of G, with
threshold $u_i > 0$ and state $x_i$ associated with each
node (neuron) i and connection of strength $T_{ij}$
associated with each edge (i, j). $T = (T_{ij})$ is the
n x n symmetric matrix, and $x_i$ can be 0 or
1. The state of a HNN in time t is the
n-vector $x(t) = (x_1(t), \dots, x_n(t))$.
There are three types of HNN:

DD - neural networks with discrete time
and discrete states $x_i$, $i = 1, \dots, n$;

DC - neural networks with discrete time and continuous
states $x_i$, $i = 1, \dots, n$;

CC - neural networks with continuous time
and continuous states.
The time evolution of HNN can be described by the
following systems of coupled nonlinear equations:

DD - networks

    x_i(t+1) = H\left( \sum_j T_{ij} x_j(t) - u_i \right),    i = 1, \dots, n,

where

    H(z) = \begin{cases} 0, & z \le 0 \\ 1, & z > 0; \end{cases}

DC - networks

    x_i(t+1) = g_i\left( \sum_j T_{ij} x_j(t) - u_i \right),    i = 1, \dots, n,

where $g_i$ is a monotonic sigmoidal function with 0 and 1
asymptotes, for example $g_i(z) = 0.5(1 + \tanh z)$;

CC - networks

    dx_i/dt = \dots

HNN can be considered a nonlinear dissipative
dynamic system. This system has a Liapunov function,
the energy of the state. All attractors of the HNN
are either fixed points or period-two limit cycles.
To enable the HNN to compute a solution to the
combinatorial optimization problem, the problem must
be described by an energy function in which the lowest
energy state (only a fixed point of the HNN) corresponds to
the best solution.
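
The two kinds of attractors can be seen on a two-neuron example. The Python sketch below is an illustration under assumptions: the function names are hypothetical, all neurons are updated synchronously, and the energy is taken in the standard Hopfield form $E(x) = -\frac{1}{2} x^T T x + u^T x$, which the text does not spell out.

```python
def step(T, u, x):
    """One synchronous DD update: x_i <- H(sum_j T_ij x_j - u_i)."""
    n = len(x)
    return tuple(
        1 if sum(T[i][j] * x[j] for j in range(n)) - u[i] > 0 else 0
        for i in range(n)
    )


def energy(T, u, x):
    """Assumed energy E(x) = -1/2 x'Tx + u'x of a state."""
    n = len(x)
    quad = sum(T[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return -0.5 * quad + sum(u[i] * x[i] for i in range(n))


def run(T, u, x0, max_steps=100):
    """Iterate until a fixed point or a period-two cycle is detected."""
    prev, cur = None, tuple(x0)
    for _ in range(max_steps):
        nxt = step(T, u, cur)
        if nxt == cur:
            return "fixed point", cur
        if nxt == prev:
            return "2-cycle", cur
        prev, cur = cur, nxt
    return "undecided", cur
```

With T = [[0, 1], [1, 0]] and u = [0.5, 0.5], the start state (1, 1) is a fixed point, while (1, 0) and (0, 1) alternate in a period-two cycle.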


Basic theoretical results of the paper are:

1. A natural way of solving combinatorial optimization
[...] or HNN, for which an infinitesimal breach of symmetry
of the matrix T causes the appearance of attractors
that are large-period cycles, may be constructed.
2. If the matrix T is positive semidefinite,
then in DD and DC-networks all attractors that
are limit cycles disappear and only fixed points
remain. But for many combinatorial optimization
problems the matrix T is not positive semidefinite.

3. By analyzing the eigenvalues and the corresponding
subspaces of the matrix T, parameters in the
energy function for combinatorial optimization problems
may be selected.

The mentioned combinatorial optimization problems
were stated as problems of minimizing the energy of
the HNN state. Many of the problems were studied with
different energy functions. Presumably, it was
determined that the neuroalgorithms find local optimal
solutions for all the problems studied. A more detailed
computational experiment was carried out for the
linear and quadratic assignment, matching, graph
partitioning, graph coloring, node covering, knapsack,
Weber-Fermat, and linear programming problems.

Since the aim of the experiment was to
study the solution of combinatorial optimization
problems in principle, small-size problems were solved.
The number m of graph nodes, as well as the number
of variables in linear programming problems, didn't
exceed 10. The knapsack and the Weber-Fermat problems
were solved with sizes up to m = 100.

Optimal solutions have been obtained in every
problem after no more than 20 attempts with different
initial states. Over 85% of attempts achieved local
optimal solutions. The number of iterations for one
attempt was about m for DD and DC-networks and about
50m for CC-networks. But DD and DC-networks did not
always obtain optimal solutions. Best results were
achieved for the knapsack and the Weber-Fermat
problems. The number of iterations required to find a
local optimum didn't exceed three.

The traveling salesman problems were solved using
Kohonen's neural networks. The results are very good.
The optimal solutions were obtained for ten problems
with 50 nodes within 2000 iterations.

So,  the  neuroalgorithms  are  a  new  interesting 
class  of  algorithms  for  combinatorial  optimization 
problems. 

[1] Hopfield J.J. The effectiveness of analogue neural
network hardware. Network, v.1, no. 1, pp. 27-40, 1990


NUMERICAL SOLUTION OF CERTAIN DECOMPOSITION -
COORDINATION PROBLEMS IN NONLINEAR PROGRAMMING
Large-scale computing of the future will
certainly be to a large extent parallel in one form or another.
One way to break a problem up into smaller subproblems which may
be treated independently is the use of decomposition-coordination
schemes, i.e. the adjustment of the aggregated problem
in each iteration decomposes it into independent subproblems of
smaller dimension. If the original problem has a block structure
with coupling parameters, then these subproblems are formulated
in accordance with the blocks. Computing requirements in several
application areas are unlikely to be satisfied solely by
uniprocessors in the future. Decomposition of an original problem
into smaller ones can greatly simplify its solution on parallel
computers of various architectures.

Numerical solution of certain decomposition-coordination
problems in convex programming often involves the solution of
systems of nonlinear equations or minimization problems to obtain
proper values for coordination parameters. The
decomposition-coordination problem has some specific features:

- the user has at his disposal only values of functions;

- the evaluation of function values includes, as a rule,
the solution of certain subproblems and therefore it can
be accompanied by a great computational effort;

- frequently these functions are continuous but not
necessarily differentiable at all points of the region under
consideration, which causes additional difficulties.

Hence obtaining a method that is robust, stable and
computationally convenient and efficient in practice is far
from a trivial task. In other words, one principal problem is the
choice of a good starting point; the second problem is how to
proceed when the Jacobi matrix is singular or ill-conditioned in
a region. Storage and computer time economy is also usually
highly desirable. Unfortunately none of the existing methods
simultaneously satisfies all the above mentioned requirements. In
this connection a polyalgorithmic approach can be fruitful, i.e.
one combines the best features of various methods, and it is required
that each single numerical algorithm involved in the combination is
efficient in a stage of the computational process or for a class of
problems.

1. Statement of the problem. Consider
a system of nonlinear equations

    H(x, p) = 0,    (1)

where $H = (H_1, \dots, H_m)^T$ and $p = (p_1, \dots, p_m)^T$, while $x = (x_1, \dots, x_n)^T$ is
to be determined as a solution of the nonlinear problems

    P_i(x_i, p) \to \min,    x_i \in \Gamma_i(p),    (2)

depending on the parameter p, where $P_i$ is the performance
index of the i-th subproblem and $\Gamma_i(p)$, a subset of a real space, is its
feasible region. Assuming that the problems (2) have the solutions $x_i = x_i(p)$,
we shall study the problem of determining the vector $p^* \in R^m$
from the equation

    H(x(p^*), p^*) = 0.    (3)

Problems of this kind arise not infrequently in convex programming
with a great number of variables or with a complicated structure.
Further we shall reformulate the problem (3) in a form more
suitable for mathematical treatment,

    F(x) = 0,    (4)

where F(x) = H(., x) and x will stand for the desired quantity
(the parameter vector).

2. Methods. In general, methods that use derivatives tend to be more
effective than derivative-free ones because of the additional
information provided. For decomposition-coordination problems, however,
only function values are available, so finite-difference approximations
of the derivatives are needed in order to apply the more efficient
methods in regions where F(x) is sufficiently smooth. In that case the
bulk of the computational work is usually spent on evaluating the
Jacobian. Sacrificing accuracy in favour of simplicity, one can use a
rank-1 approximation to the Jacobian, which is comparatively cheap to
evaluate. As a rule, however, the total number of iterations then
increases, so the economy gained is questionable. Moreover, a serious
disadvantage of the Newton method, shared by all Newton-like methods, is
the possibility of divergence when the Jacobian is singular, nearly
singular or ill-conditioned at some iterates, since these methods are
based on a linear model. Methods of order $p \ge 3$ exploit at least a
quadratic model and are therefore capable of solving systems of
nonlinear equations with a singular or ill-conditioned Jacobian. The
total cost of an iterative method is determined by the number of
iterations needed to achieve the required accuracy and by the cost of
each iteration. In this respect, methods with a convergence order higher
than that of the Newton method appear promising: as a rule they need
fewer iterations to compute a solution to a prescribed accuracy, and
therefore likely less total arithmetic for evaluating the function and
its derivatives, than methods based on a linear model.
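The finite-difference variant of the Newton method mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the forward-difference step h, the tolerance, the small Gaussian-elimination helper and the two-equation test system are choices made here and do not come from the paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            raise ZeroDivisionError("singular Jacobian approximation")
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fd_jacobian(F, x, fx, h=1e-7):
    """Forward-difference approximation of F'(x); costs n extra evaluations of F."""
    n = len(x)
    J = [[0.0] * n for _ in range(len(fx))]
    for j in range(n):
        xh = list(x)
        xh[j] += h
        fh = F(xh)
        for i in range(len(fx)):
            J[i][j] = (fh[i] - fx[i]) / h
    return J


def fd_newton(F, x0, tol=1e-10, max_iter=300):
    """Newton iteration that uses function values only; returns (x, iterations)."""
    x = list(x0)
    for k in range(max_iter):
        fx = F(x)
        if max(abs(v) for v in fx) <= tol:
            return x, k
        step = solve_linear(fd_jacobian(F, x, fx), [-v for v in fx])
        x = [xi + si for xi, si in zip(x, step)]
    raise RuntimeError("iteration limit exceeded")


def example(v):
    # Hypothetical test system: x^2 + y^2 = 4, x*y = 1.
    return [v[0] ** 2 + v[1] ** 2 - 4.0, v[0] * v[1] - 1.0]


root, iterations = fd_newton(example, [2.0, 0.5])
```

Each iteration costs n + 1 function evaluations and one linear solve, which illustrates why most of the work goes into the Jacobian when only function values are available.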

If a sufficiently good initial estimate of the solution is available,
then for the sake of economy one can use the modification of the
method of tangent parabolas

<5) 

^k  "k  -  i^<2yk-i  -  '-k-i-^-i 

which has an asymptotic convergence rate equal to 3 provided the
second-order derivative F'' is Lipschitz continuous and other
reasonable assumptions are fulfilled [1]. If F has only first
derivatives and the corresponding divided differences are Lipschitz
continuous, then its convergence rate is at least equal to … [2].
The procedure (5)-(6) requires little information per iteration: two
values of F and one value of the divided difference (except at the
first iteration).

The method (5)-(6) can also be interpreted as a limiting case
($p \to 0$) of a parametric family of iterative methods

=  'k  -  2*kJ'<''k>  *  I  +  oV^'k”  -  ''('-k” 

Vj^  =  Xj^  -  Xj^P(Xj^),  k=0,1 .  (8) 

where $A_k$ and … are certain linear operators approximating
$(F')^{-1}$ and $p$ is a nonzero real parameter [1].

If the dimension of the problem is not large, it may be appropriate
to use the following modification of the method of tangent
hyperbolas

where  =  Xj^  -  ^Bj^P(Xj^)  and 

In  particular,  the  symbols  Aj^,  A^^,  and  Bj^  not  only 

denote  finite  difference  approximations  to  =  [PVXj^)]"^  and 

[P'(Xj^  -  ]~^  respectively  but  they  can  also  express  the 

fact  that  the  corresponding  linear  equations  are  solved  approxi¬ 
mately  [3].  - 

In addition to the stability and economy of a method, a crucial
problem is the choice of a good starting point. A serious defect of
high-order methods is their demanding requirements on the initial
guess: their advantages become evident mostly in the close vicinity
of a solution.

One of the most effective ways to guarantee global convergence, or at
least to expand greatly the domain of convergence of a method, is the
"continuation strategy". According to this strategy one first replaces
$F(x) = 0$ by a one-parameter family of problems $G(x, \lambda) = 0$,
$\lambda \in [0, 1]$, such that $F(x) = G(x, 1)$ and the solution of
$G(x, 0) = 0$ is known. Secondly, one solves a series of problems as the
parameter $\lambda$ is slowly varied, using a locally convergent
iterative method with the solution of the previous problem as the
starting point for the current problem. The rules for changing the
value of $\lambda$ can be suggested by the physical nature of the
problem, but some standard mathematical algorithms exist for this as
well. All continuation (homotopy) methods suffer from the disadvantage
that the Jacobian could become singular at some points. Therefore the
homotopy strategy needs stable local algorithms to be successful [4,5].
Methods based on a quadratic model meet this requirement, and using
them in homotopy methods can be justified even if the implementation of
high-order methods within homotopy methods is not efficient for some
particular problems because of the great amount of computational effort
involved.
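The continuation strategy can be illustrated on a scalar equation. The sketch below is an assumption-laden example rather than the procedure of [4,5]: it uses the Newton homotopy G(x, lambda) = f(x) - (1 - lambda) f(x0), a fixed number of equal steps in lambda, and the test function arctan, for which the plain Newton iteration started at x0 = 3 diverges.

```python
import math


def newton(f, df, x, tol=1e-12, max_iter=50):
    """Scalar Newton iteration; returns (x, converged)."""
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            return x, True
        d = df(x)
        if d == 0.0 or not math.isfinite(d):
            return x, False
        x = x - fx / d
    return x, False


def continuation(f, df, x0, steps=10):
    """Follow G(x, lam) = f(x) - (1 - lam) * f(x0) from lam = 0 to lam = 1."""
    f0 = f(x0)
    x = x0  # x0 solves G(x, 0) = 0 by construction
    for i in range(1, steps + 1):
        lam = i / steps
        g = lambda t, lam=lam: f(t) - (1.0 - lam) * f0
        # dG/dx equals f'(x), so df is reused for every stage.
        x, ok = newton(g, df, x)
        if not ok:
            raise RuntimeError("local method failed at lambda = %g" % lam)
    return x


def d_atan(x):
    return 1.0 / (1.0 + x * x)


plain_x, plain_ok = newton(math.atan, d_atan, 3.0)
root = continuation(math.atan, d_atan, 3.0)
```

Here the previous solution always lies inside the convergence region of the next subproblem, which is the whole point of varying the parameter slowly.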

This report suggests another approach to ensuring convergence from a
poor starting point. The strategy consists in combining global and
local methods to obtain more robust ones. This approach may be more
successful for decomposition-coordination problems because of their
possible nonsmoothness at some iterates, i.e. the functions determining
the problem may belong to the class of almost differentiable functions
[8]. The strategy of the polyalgorithmic procedure under consideration
is to use a high-order method (p > 2) as long as it works, and otherwise
to switch to a modification of the Newton method with a
finite-difference approximation of the Jacobian [cf. 9], bearing in mind
that the Newton method requires a good initial guess only to guarantee
quadratic convergence and in many cases may make progress from a poor
initial estimate [7]. If the Newton-type method does not work either,
one switches to a slower but sure global method chosen according to the
smoothness of the problem: a method based on the steepest descent
direction, provided the functions involved are differentiable, or
otherwise a method using subgradients (e.g. the methods from [8]). After
a certain number of iterations of the global method, one attempts to
start the high-order method once again.
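The switching logic described above can be sketched for a scalar equation. This is a simplified illustration, not the authors' implementation: the high-order step is a generic two-step Newton-type step with a frozen derivative (standing in for (5)-(6)), a step "works" if it reduces |f|, the global method is a backtracking steepest-descent step on 0.5 f(x)^2, and the high-order method is retried after every single global step. All function names are hypothetical.

```python
import math


def fd_derivative(f, x, h=1e-7):
    """Forward-difference approximation of f'(x)."""
    return (f(x + h) - f(x)) / h


def high_order_step(f, x):
    """Two Newton-type corrections sharing one derivative estimate."""
    d = fd_derivative(f, x)
    y = x - f(x) / d
    return y - f(y) / d


def newton_step(f, x):
    return x - f(x) / fd_derivative(f, x)


def descent_step(f, x):
    """Backtracking step along the negative gradient of 0.5 * f(x)**2."""
    g = f(x) * fd_derivative(f, x)
    t = 1.0
    while t > 1e-12:
        trial = x - t * g
        if abs(f(trial)) < abs(f(x)):
            return trial
        t *= 0.5
    return x


def polyalgorithm(f, x, tol=1e-10, max_iter=300):
    """Try the fastest method first and fall back to slower, safer ones."""
    for k in range(max_iter):
        if abs(f(x)) <= tol:
            return x, k
        for step in (high_order_step, newton_step, descent_step):
            try:
                trial = step(f, x)
            except ZeroDivisionError:
                continue
            if math.isfinite(trial) and abs(f(trial)) < abs(f(x)):
                x = trial
                break
        else:
            return x, -1  # no method made progress
    return x, -1


root, iterations = polyalgorithm(math.atan, 3.0)
```

From x = 3 both fast steps are rejected at first, so the descent step carries the iterate towards the solution until the high-order step starts to reduce the residual.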

In  the  case  of  smooth  problems  one  can  take  as  a  global 
method  the  one 

*k+i  =  *k  - 

where  g(x)  =  [AP(i,h) and.  AP(x,h)  is  a  finite-difference 
approximation  to  F'-  and  h  is  the  step  of  discretization  or  the 
two  parametric  methods  [cf.9] 

Likewise, (10) and (11) can be used for solving the linear equations in
the inexact local methods.

3. Numerical examples. It is well known that mathematically efficient
methods do not necessarily result in efficient computer programs. On
the other hand, the worst-case bounds for a method may be too
pessimistic, and in practice the method may, on average, perform better
than expected.

The performance of the above methods (5)-(6), (7)-(8) and (9) was
tested on a small set of test problems: 14 problems for systems of
nonlinear equations from Argonne National Laboratory, plus the
Freudenstein and Roth function and the Box three-dimensional function
for nonlinear least squares, taken from [10]. The methods under review
were compared with the following methods, which were tested on the same
set of test problems.

1 .  The  Newton  method. 

2.  The  modified  Newton  method  k=1,2,  —  ). 


3. The version of the Newton method with finite-difference
approximation.

4. The Bartysh method ($p = 1 + \sqrt{2}$) [11].

5.  The  Kogan  method  [12] 


-1 


•k.i  =  ■'k  -  i'">''k)r  f'(Jk)- 

6.  The  modification  of  the  Kogan  method 


Vk  =  -  jFj^FCXj^) 


X 


k+1 


where I denotes the identity mapping.

7. The Shacham method (CONLES), which combines the Newton and the
Levenberg-Marquardt methods [13].

The numbering continues as follows.

8.  The  method  (5)-(6). 

9. The combination of the method (5)-(6) and the Newton method.

10. The method (7)-(8) with p = -1, p = -1/2 and p = -5.

All the problems were run on an EC1060 computer under a
FORTRAN-IV compiler. The calculations were performed in double

precision  and  stopped  ^  10”®.  Termination  of  the 

routine also occurred when the number of iterations exceeded the
given maximum value of 300. Table 1 gives the iteration number N at
which the prescribed accuracy was achieved; "-" denotes failure
(nonconvergence). Table 1 indicates that the methods (5)-(6) and
(7)-(8) require a good initial estimate, but when started from an
initial point close enough to the solution they converge very fast.
Moreover, they gave almost the same results in both single and double
precision, while the Newton method yielded worse results in single
precision.

Table 1

                                          Method
Test             1      2      3      4      5      6      7      8      9     10     10     10
                                                                             p=-1 p=-1/2   p=-5
1                3      3      3      3      2      2      3      2      2      2      2      2
2               32  k>300     31     27     21     21     18      -     38     14     14     13
3               13      -     19     12      9      9     12      -     12      8      8      8
4               15  k>300     47     14     28     24     14      -     17      -    140      ?
5               11      -      -      6      8      9      -     10      -      -      -      ?
6 (m=6)         13  k>300     45     15      9      9     12      -     14      8      8      -
7                6      -      6      6      4      5      5      -      5      4      -      -
8               91      -     80      -  k>300      -      6      -     92      -      -      ?
9                4      6      3      4      3      3      3      3      3      3      3      2
10 (m=10)        4      6      4      4      3      3      3      3      3      3      3      2
11               8  k>300      6     12      7      -      8      -      6      -      -      -
12              15  k>300     20     13     10     10     14      -     16      9      9      9
13               5     22      4      5      4      4      5      -      5      3      3      3
14               7     60      6      6      5      5      6      -      5      4      4      4
15              43  k>300     42     10     25     60     49     23     23     67     51    229
16               6     30      5      5      4      4      5      4      4      4      4      3

?: entry illegible in the source. Rows 4, 5 and 8 have only eleven
legible entries, listed here in source order, so the illegible entry may
belong to another column of those rows.

The comparison of columns 5 and 6 shows that the Kogan method and its
inexact version give approximately the same results.

The hybrid method that combines the best features of (5)-(6) and the
Newton method, as well as the method CONLES, was slightly superior to
the Newton method in both speed and accuracy. These promising results
encourage further investigation of the properties of polyalgorithmic
procedures.


Extended Abstract


Some  large  scale  LP  relaxations  for  the  graph 
partitioning  problem  and  their  optimal  solutions 

M.  Minoux 
Universite  Paris  6* 


1  Introduction 

The so-called Graph Partitioning Problem (GPP) has applications in many areas such
as data analysis and clustering, VLSI circuit layout, block decomposition of large linear
systems, etc.

The purpose of this paper is to investigate a family of large scale linear programming
relaxations for (GPP) which extends the one considered in Minoux [4] and Minoux and
Pinson [5]. Some of the problems in the family are shown to be polynomially solvable, in
which case characterizations of their optimal solutions are given.

As  an  outcome,  bounds  are  derived  which  may  be  of  use  e.g.  to  validate  approximate 
(heuristic)  solutions,  or  to  implement  Branch  and  Bound  procedures. 

We consider here the unconstrained version of the graph partitioning problem, which may
be stated as follows. Suppose we are given an undirected connected graph $G = [X, U]$ with
node set $X$ ($|X| = N$) and edge set $U$ ($|U| = M$), and an integer $p < N$. For any
$S \subset X$, we denote by $\gamma(S)$ the total number of edges in $U$ having both
endpoints in $S$.

(GPP) is then to find a partition of $X$ into $p$ subsets $S_1, S_2, \dots, S_p$ such that
the quantity

$z = \gamma(S_1) + \gamma(S_2) + \cdots + \gamma(S_p)$

is maximized.
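The objective of (GPP) is easy to evaluate for a given partition. The following sketch shows one way to do it; the function names and the small example graph (two triangles joined by one edge) are assumptions made for illustration only.

```python
def gamma(edges, subset):
    """Number of edges with both endpoints in the given subset."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s)


def partition_value(edges, partition):
    """Objective z of (GPP): total number of edges inside the parts."""
    return sum(gamma(edges, part) for part in partition)


# Hypothetical example: triangles {1,2,3} and {4,5,6} joined by edge (3,4).
EDGES = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]

z_triangles = partition_value(EDGES, [{1, 2, 3}, {4, 5, 6}])
z_other = partition_value(EDGES, [{1, 2}, {3, 4, 5, 6}])
```

For p = 2 the partition into the two triangles keeps six of the seven edges inside the parts, while the second partition keeps only five.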


*MASI, Université Paris 6, 4 Place Jussieu, 75005 Paris, France

2  A family of large scale (LP) relaxations

For any integer $q$ ($N - p + 1 \le q \le N$) we consider the large scale set partitioning
problem defined as follows.

Suppose that the nonempty subsets of $X$ are numbered $1, 2, \dots, J$ ($J = 2^N - 1$) and
let $J_q$ denote the number of subsets having cardinality $\le q$.

Let us denote by $A = (a_{ij})$, $i = 1, \dots, N$, $j = 1, \dots, J_q$, the incidence
matrix of all the subsets with cardinality $\le q$. Thus, for the subset $S_j$ having index
number $j$ ($1 \le j \le J_q$), $a_{ij} = 1$ if node $i$ is an element of $S_j$, and
$a_{ij} = 0$ otherwise.

To each subset $S_j$ we let correspond a binary variable $x_j$ ($x_j = 1$ if subset $S_j$
is selected in the optimum partition to be found, $x_j = 0$ otherwise) and a cost $c_j$
defined by:

$c_j = \gamma(S_j)$

where $\gamma(S_j)$ denotes the total number of edges in $U$ having both endpoints in
$S_j$. With this notation, the large scale set partitioning problem $(R[q])$ reads:

$(R[q])$   Maximize $\sum_{j=1}^{J_q} c_j x_j$

subject to:

$A x = \mathbf{1}$   (1)

$\sum_{j=1}^{J_q} x_j = p$   (2)

$x \in \{0, 1\}^{J_q}$   (3)

($\mathbf{1}$ denotes the $N$-vector with all components 1).

Since any partition solving (GPP) should be composed only of subsets having cardinality
$\le N - p + 1$, for any $q \ge N - p + 1$, $(R[q])$ and (GPP) are equivalent. Even for
graphs having a moderate number of nodes ($> 15$, say), $(R[q])$ will have an enormous
number of columns. So for $q \ge N - p + 1$ we are led to consider the linear relaxations
$(\bar{R}[q])$ of $(R[q])$, which are simply obtained by replacing (3) by:

$x_j \ge 0 \quad (j = 1, \dots, J_q).$

We note here that for the special case $q = N$ the set partitioning problem $(R[N])$ and
its relaxation $(\bar{R}[N])$ were introduced and studied in Minoux [4]. In particular it
was shown there that $(\bar{R}[N])$ is polynomially solvable by means of the "Ellipsoid
Algorithm" (Khachian [3]). In Minoux and Pinson [5] a generalized linear programming
algorithm (Dantzig [1]) was applied to derive bounds on optimal solution values to (GPP),
and computational results were presented.




3  Polynomially solvable cases and associated optimal solutions

The first result concerns the special case $q = N$. The polynomial solvability of
$(\bar{R}[N])$ was already established in Minoux [4], but the proof there was based on the
use of the Ellipsoid algorithm; thus the existence of a purely combinatorial algorithm for
solving $(\bar{R}[N])$ was left open. Theorem 1 below shows that, at least for some values
of $p$, the question may be answered positively. Given a subset $S \subset X$ such that
$|S| \ge 2$, we define $B[S]$ as the $(N + 1) \times (N + 1)$ matrix:

$B[S] = \begin{pmatrix} I_N & v \\ \mathbf{1}^T & 1 \end{pmatrix}$

with $v$ the incidence vector of $S$ in $X$.

Theorem 1  Consider $S^* \subset X$, $|S^*| \ge 2$, such that:

$\gamma(S^*)/(|S^*| - 1) = \max_{|S| \ge 2} \{\gamma(S)/(|S| - 1)\}$   (4)

and suppose that $|S^*| \ge N - p + 1$.

Then $B[S^*]$ is an optimal feasible basis for $(\bar{R}[N])$ and the corresponding
optimal objective function value (an upper bound to (GPP)) is:

$(N - p)\,\gamma(S^*)/(|S^*| - 1)$   (5)

The maximization problem (4) is a maximum ratio problem which can be solved in
polynomial time by a purely combinatorial algorithm (through a sequence of maximum
flow problems). Therefore, when the conditions of Theorem 1 apply (namely
$p \ge N - |S^*| + 1$), this result leads to a purely combinatorial algorithm for solving
$(\bar{R}[N])$ that is both practically and theoretically more efficient than the
previously known approaches (Ellipsoid algorithm or generalized linear programming).
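For very small graphs the ratio problem (4) can be checked by brute force. The sketch below is only an illustration: it enumerates all subsets, which takes exponential time, instead of using the maximum flow approach mentioned above, and the function names and example graph are hypothetical. The bound it reports rests on an assumption about the basis: if the basic columns are the N singletons together with the column of S*, the constraints give $x_{S^*} = (N - p)/(|S^*| - 1)$, and since singletons have zero cost the objective is $(N - p)\,\gamma(S^*)/(|S^*| - 1)$.

```python
from fractions import Fraction
from itertools import combinations


def gamma(edges, subset):
    """Number of edges with both endpoints in the given subset."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s)


def max_ratio_subset(nodes, edges):
    """Maximize gamma(S) / (|S| - 1) over all subsets with |S| >= 2."""
    best_ratio, best_subset = None, None
    for size in range(2, len(nodes) + 1):
        for subset in combinations(nodes, size):
            ratio = Fraction(gamma(edges, subset), size - 1)
            if best_ratio is None or ratio > best_ratio:
                best_ratio, best_subset = ratio, subset
    return best_ratio, best_subset


def ratio_bound(n, p, ratio, subset):
    """(N - p) * ratio when |S*| >= N - p + 1, otherwise None."""
    if len(subset) >= n - p + 1:
        return (n - p) * ratio
    return None


# Hypothetical example: two triangles joined by the edge (3, 4).
NODES = [1, 2, 3, 4, 5, 6]
EDGES = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]

best_ratio, best_subset = max_ratio_subset(NODES, EDGES)
bound_p4 = ratio_bound(len(NODES), 4, best_ratio, best_subset)
bound_p2 = ratio_bound(len(NODES), 2, best_ratio, best_subset)
```

In this example a triangle attains the maximum ratio 3/2. For p = 4 the cardinality condition holds and the bound equals 3, which is attained by the partition {1,2,3}, {4}, {5}, {6}; for p = 2 the condition fails and no bound is returned.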

The second result below is a generalization of the previous one to other problems in the
family $(\bar{R}[q])$, for some values of $q$ in the range $[N - p + 1, N]$. Since the
strength of the relaxations $(\bar{R}[q])$ improves for smaller values of $q$, it is not
surprising that the resulting (upper) bounds are never worse (but usually better) than
those given by (5).

Theorem  2  For  A  >  A*  =  ^  =  max  <7?r— let  ^  be  a  subset  satisfying: 

7(^)  -  A(|5|  -  1)  =  m«{7(5)  -  A(|5|  -  1)}  (6) 


and  (5)  =  q. 




Then  for  q  >  N  —p+ 1,  is  an  optimal  feasible  basis  for  (i?[9])  and  the  corresponding 
optimal  objective  function  vahie  (an  upper  bound  for  (GPP))  is: 


(7) 

Since  there  are  only  finitely  many  subsets  of  X,  (6)  has  only  finitely  many  distinct  optimal 
solutions  when  A  varies  in  the  range  ]A*,+oo(.  Let  denote  those  solutions  with 

cardinality  >  N  —  p  +  I  and  let  |^i|  =  ^i,  |^2|  =  1^,|  =  qt.  Theorem  2  implies  that 

(/il?])  is  polynomi2Jly  solvable  (by  a  purely  combinatorial  algorithm)  for  q  €  ft,  •  •  •  >  9(}- 
Moreover  the  best  possible  (upper)  bound  for  (GPP)  which  may  be  derived  from  (7)  is: 


{N-p)x 


15,1  -  1 


=  (N  —  p)  X  min 


Optimal Budgetary Policies under Uncertainty:
A  Stochastic  Control  Approach 
Extended  Abstract 


Since  the  mid-eighties  the  size  of  the  federal  budget  deficit  has  been  of  much  concern 
to  policy-makers  in  Austria.  By  and  large,  they  now  generally  agree  upon  the  necessity 
of  consolidating  the  federal  budget  to  prevent  a  loss  of  credibility  of  fiscal  policies. 
Nevertheless,  there  are  trade-offs  and  side-effects  associated  with  a  policy  of  gradually 
or even suddenly diminishing the budget deficit. So far, no quantitative information
is available about the effects of budgetary measures on the main objectives of
Austrian  economic  policy,  such  as  growth,  full  employment,  price  stability,  and  balance- 
of-payments  equilibrium.  Moreover,  neither  the  intertemporal  trade-offs  nor  the  issue 
of  policy-makers’  limited  information  about  future  events  has  received  serious 
attention  in  the  political  debate  in  Austria  so  far.  The  question  of  how  to  design 
budgetary  policies  under  these  conditions  can  be  seen  as  a  typical  problem  of  the 
theory  of  quantitative  economic  policy.  If  we  are  ready  to  postulate  an  objective 
function  to  be  optimized  by  policy-makers,  we  can  apply  stochastic  control  theory  to 
derive and analyse optimal budgetary policies for past or future periods.

In  this  paper  we  determine  optimal  budgetary  policies  for  Austria  using  a  small 
macroeconometric  model.  This  model,  called  FINPOLl,  is  based  on  traditional 
Keynesian  macroeconomic  theory  in  the  sense  of  IS-LM/aggregate  demand-aggregate 
supply  models.  Stochastic  behavioral  equations  for  the  demand  side  of  the  economy 
include  a  consumption  function,  an  investment  function,  an  import  function,  and  an 
interest-rate  equation  as  a  reduced-form  money  market  model.  Prices  are  largely 
determined  by  aggregate  demand  variables.  Disequilibrium  in  the  labor  market,  as 
measured  by  the  excess  of  unemployed  persons  over  vacancies,  is  modeled  to  depend 
upon  the  real  GDP  growth  rate  and  the  rate  of  inflation,  embodying  both  an  Okun’s 
law-type  relation  and  a  rudimentary  Phillips  curve.  The  main  objective  variables  of 
Austrian economic policies, such as real GDP, the labor market disequilibrium variable
(related to the rate of unemployment), the rate of inflation, the balance of payments
and the ratio of the federal net budget deficit to GDP, are related to those fiscal and
monetary  policy  instruments  which  are  used  as  control  variables.  Particular  attention  is 
given  to  the  influence  of  variables  of  the  federal  budget  (revenues  and  expenditures)  on 
the  endogenous  variables  of  the  model.  The  model  FINPOLl  is  a  nonlinear  model;  it 
was  estimated  by  ordinary  least  squares  using  annual  data.  Several  tests  and  tentative 
simulation  experiments  indicate  that  its  tracking  capability  is  reasonable  and  that  it 
provides  a  satisfactory  framework  for  analysing  effects  of  budgetary  policies  on  the 
Austrian  economy. 

Next,  we  apply  the  algorithm  OPTCON  to  calculate  optimal  fiscal  policies  for  the 
eighties,  during  which  the  problem  of  rising  federal  budget  deficits  was  most 
pronounced.  OPTCON  has  been  developed  by  Josef  Matulka  and  the  author;  it 
determines  approximately  optimal  control  paths  for  nonlinear  stochastic  dynamic 
systems  under  quadratic  objective  functions.  We  give  a  short  description  of  this 
algorithm  and  discuss  its  strengths  and  present  limitations.  An  objective  function  is 
formulated  which  serves  as  expression  of  the  preferences  of  an  hypothetical  policy¬ 
maker  in  charge  of  the  federal  budget.  Target  paths  and  preference  weights  for  the 
objective  variables  are  determined  in  an  ad-hoc  way  after  some  trial  and  error.  Using 
this  objective  function  and  the  dynamic  constraints  as  given  by  the  equations  of  the 
model FINPOLl, we determine approximately optimal time paths for the budgetary
control  variables  and  the  endogenous  (especially  the  target)  variables  under  the 
assumption  of  complete  information  for  the  policy-maker.  In  addition  to  this 
deterministic  optimization  run,  different  assumptions  about  parameter  uncertainties 
are  introduced  in  order  to  assess  the  influence  of  various  kinds  of  uncertainty  on  the 
design  of  optimal  budgetary  policies.  In  particular,  we  investigate  the  effects  of  making 
several  key  parameters  determining  fiscal  and  monetary  policy  multipliers  uncertain. 
The results of these different stochastic optimum control experiments are compared to
each other and to the historical values of the control and endogenous variables.
Interpretations  are  given  for  the  effects  of  uncertainty  on  optimal  budgetary  policies 
and  for  the  potential  of  improving  policy  performance  by  using  optimization  techniques 
such  as  stochastic  optimum  control  theory.  We  conclude  that,  contrary  to  popular 
assertions,  there  is  a  considerable  potential  for  welfare  gains  to  be  obtained  by  using 
optimization  methods  for  the  design  of  economic  policies,  even  and  particularly  in  the 
presence  of  uncertainty  about  policy  effects.  Some  suggestions  on  how  to  exploit  that 
potential  for  the  formulation  of  budgetary  policies  in  Austria  in  the  future  are  given. 


In the multiple depot vehicle routing problem (MDVRP), a fleet of heterogeneous vehicles with
known capacities originates from and returns to depots with fixed locations, and services a given
set of nodes with known demands for service. The objective is to minimize the variable cost of
vehicles traveling to serve the nodes. The well-known vehicle routing problem (VRP) is a
special case in which there is a single depot. The VRP is known to be an NP-hard problem. Thus,
the computational effort for known exact solution procedures increases exponentially with
problem size.

Cluster-First/Route-Second  heuristic  methods  approach  the  solution  of  multiple  vehicle  problems 
by  assigning  nodes  to  vehicles  in  a  first  phase,  then  determining  the  sequences  for  each  vehicle  in 
a  second  phase.  Seed-setting  methods  are  a  particularly  successful  cluster-first/route-second 
approach  to  the  VRP.  These  methods  utilize  a  set  of  geographical  locations  called  seeds,  one  for 
each  vehicle,  that  model  the  nominal  directions  and  distances  from  a  depot  that  the  vehicles 
travel.  The  seeds  are  used  to  set  parameters  for  a  procedure  that  assigns  the  nodes  to  vehicles 
without  exceeding  their  capacities.  Examples  of  such  parameters  include:  i)  the  straight  line 
distances from node locations to the seeds, and ii) the extra distances incurred if vehicles
traveling  to  a  seed  and  back  deviate  to  serve  the  node.  Using  parameters  such  as  these,  a  variety 
of  node  assignment  procedures  have  been  developed,  including  generalized  assignment  integer 
programming  solvers  and  various  heuristics.  As  the  final  step,  a  traveling  salesman  problem 
(TSP)  must  be  solved  for  each  set  of  nodes  associated  with  each  vehicle.  In  the  MDVRP,  the 
process  of  assigning  nodes  to  seeds  (vehicles)  implicitly  assigns  the  nodes  to  depots  as  well. 

Recent research has proven successful in employing genetic search to iteratively seek seed
location patterns that generate good solutions to the VRP. Genetic search is a general purpose
heuristic procedure that uses concepts of selection and inheritance to artificially evolve good
solutions, within a framework inspired by the genetics of biological systems. In a genetic search,
three items must be supplied: a representation of a candidate solution as an artificial
chromosome, a means of evaluating the fitness of a candidate solution, and recombination
operators. In our application, an artificial chromosome consists of (x,y) coordinate locations for
seeds expressed in binary. Fitness evaluation is through generalized assignment integer
programming models and other methods that assign nodes to seeds, followed by a traveling
salesman heuristic. Recombination adjusts seed point locations through two-point crossover.
With this method, best-known solutions to many well-known vehicle routing problems have been
generated. In this study, we extend the use of genetic search for seed setting to the multi-depot
case. Several heuristics were developed for assigning the nodes to seed points. Exact and
heuristic solvers for generalized assignment problems were developed and tested for this
purpose, as well as greedy clustering heuristics. Local tour improvement operators were devised
to modify the tours given by the TSP solver at each iteration. Both network optimization and
r-opt types of operators were effective in local tour improvement. In some of the experiments,
several node assignment heuristics were run at each iteration, and the best method was used in
the genetic selection process within the population. In essence, this provides several alternative
evaluation methods for each population member, a new technique that we call
multiple sharing evaluation functions.
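The chromosome representation and the two-point crossover described above might look as follows. This is a minimal sketch under assumptions that are not taken from the abstract: each coordinate is an integer encoded with 8 bits, the cut points are passed in explicitly, and all function names are hypothetical.

```python
def encode(seeds, bits=8):
    """Concatenate the binary codes of the (x, y) coordinates of all seeds."""
    return "".join(format(c, "0%db" % bits) for xy in seeds for c in xy)


def decode(chromosome, bits=8):
    """Inverse of encode: recover the list of (x, y) seed locations."""
    values = [int(chromosome[k:k + bits], 2)
              for k in range(0, len(chromosome), bits)]
    return list(zip(values[0::2], values[1::2]))


def two_point_crossover(a, b, i, j):
    """Exchange the segment between cut points i and j of two parents."""
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


parent_a = encode([(3, 200), (17, 64)])
parent_b = encode([(90, 10), (250, 128)])
child_a, child_b = two_point_crossover(parent_a, parent_b, 5, 20)
```

Decoding a child gives a new pattern of seed locations, which would then be evaluated by assigning nodes to seeds and routing each vehicle.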

The  methods  were  experimentally  tested  on  suites  of  test  problems  that  are  extensions  of 
problems  from  the  literature,  as  well  as  on  randomly  generated  problems.  Problems  with  up  to 
250  nodes  and  25  vehicles  were  tested  on  a  DEC  5000/133  desktop  workstation  computer.  A 
visualization interface and statistical measures show that efficient routes are consistently
generated by the procedure. For problems with small numbers of depots, rough comparisons
with single depot solutions can be obtained. We determined that our procedure is competitive
with  known  VRP  solvers  for  such  problems.  Although  the  MDVRP  is  of  high  practical 
importance,  heuristics  previously  developed  specifically  for  the  problem  are  unsophisticated,  and 
our  new  methods  produce  much  better  results. 


(*) Contact author


1  Introduction

The order preserving assignment problem is the following variation of the classical assignment (or marriage)
problem: given an ordered list of n "items", p possible "positions" and integer "profits" $c_{ij}$ for the assignment
of item i to position j, find a profit-optimal assignment of items to positions that uses at most p contiguous
positions (starting with position 1) and that preserves the order of the list of items, i.e. the highest ranked
item that gets assigned to a position is assigned to position 1 and so forth. Note, however, that we do
not require that any item be assigned to a particular position. We are merely requiring that (the straight)
"lines do not cross". Figure 1 shows an example of an order preserving assignment (OPA) for n = 4 and
p = 3 that uses two positions. We assume that the order of the items agrees with their indexing, i.e. item 1
is the highest ranked item, item 2 the second highest and so on and, of course, that the profits are additive.

[Figure 1: An order preserving assignment for n = 4 and p = 3; items on the left, positions on the right.]

*Supported in part by grants from AFOSR, ONR and the Alexander-von-Humboldt Stiftung.

Since to every combination of j items out of n distinct items there corresponds exactly one order
preserving assignment, it follows that there are exactly $\sum_{j \le p} \binom{n}{j}$ OPAs for n items and
at most p positions. This is a substantially smaller number than the number of possible assignments
(without regard to order and contiguity), of which there are exactly … . However, the number of OPAs for
reasonably sized n and p puts the optimization problem out of reach for complete enumeration and, as
small examples show, simple greedy procedures do not solve the problem either. In section 3 of this paper,
we formulate the order preserving assignment problem, give a minimal description of the problem in terms
of linear inequalities, show that a profit-optimal order preserving assignment can be found in strongly
polynomial time and prove that the diameter of the associated polytope equals two. In section 4 we derive
the corresponding results for the case when all p positions must be assigned. In section 5 we show how to
find a profit-optimal order preserving assignment in linear time, i.e. in time that is linear in the number of
variables of the problem, for both cases that we consider. So, as far as assignments are concerned, "order
and contiguity" reduce the set of feasible assignments, do not bring down the diameter of the polytope
(compare to Balinski and Russakoff [1972] for the classical assignment problem), but they make algorithmic
"life" easier (compare e.g. to Balinski [1983, 1985], Kuhn [1955], Munkres [1957]). In section 6 we discuss
two modifications of the basic model. Throughout the paper we assume that the reader is familiar with the
fundamental concepts of graph theory, linear programming and the analysis of algorithms. For a survey of
the polyhedral theory that we employ we refer the reader to Grötschel and Padberg [1985]. This note is a
summary of the principal results that can be found in detail in our paper [9].

2  Some  definitions 


The problem is defined on a bipartite graph $G = (N, P, E)$ where $N = \{1,\dots,n\}$, $P = \{1,\dots,p\}$ and $1 \le p \le n$. An edge of $G$ is an ordered pair $(k,t) \in N \times P$ and is denoted by a single letter ($e$, $a$, etc.) or by its defining pair of indices. We assume at first that $G$ is the complete bipartite graph, but we will redefine the edge set $E$ after the following definition.

Definition 1 A subset $A \subseteq E$ is an order preserving assignment (OPA) if

(i) for every $k \in N$ there is at most one $a \in A$ such that $k \in a$.

(ii) for every $t \in P$ there is at most one $a \in A$ such that $t \in a$.

(iii) if $a = (k,t) \in A$ and $t \ge 2$, then there exists $\kappa \in N$, $\kappa < k$ such that $(\kappa, t-1) \in A$.

(iv) if $a = (k,t)$, $b = (\kappa,\tau) \in A$ and $a \ne b$, then either $k < \kappa$ and $t < \tau$ or $k > \kappa$ and $t > \tau$.

Properties (i) and (ii) ensure that $A$ is an assignment, property (iii) ensures that positions are assigned contiguously starting with position 1, whereas property (iv) ensures that the assignment preserves the order of the items. It follows from the definition of an OPA that item $k$ cannot be assigned to a position $t > k$. Hence there is no need to consider edges $(k,t)$ with $t > k$ and thus the edge set $E$ of $G$ is given by

$$E = \{(k,t) \in N \times P \mid t \le k \le n,\ 1 \le t \le p\}.$$
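Definition 1 and the count of OPAs quoted in the introduction can be cross-checked by brute force on small instances. The following is a minimal sketch, not part of the paper: the names `is_opa`, `enumerate_opas` and `expected_count` are hypothetical, and the expected count assumes the sum of binomial coefficients over 0 to $p$ used positions, including the empty assignment.

```python
from itertools import combinations
from math import comb

def is_opa(A):
    """Check properties (i)-(iv) of Definition 1 for a collection A of (item, position) pairs."""
    items = [k for k, _ in A]
    positions = [t for _, t in A]
    # (i), (ii): every item and every position occurs in at most one pair
    if len(set(items)) != len(items) or len(set(positions)) != len(positions):
        return False
    # (iii): position t >= 2 requires position t-1 to hold a higher ranked item
    for k, t in A:
        if t >= 2 and not any(t2 == t - 1 and k2 < k for k2, t2 in A):
            return False
    # (iv): any two pairs are ordered the same way by item and by position
    for (k, t), (k2, t2) in combinations(A, 2):
        if not ((k < k2 and t < t2) or (k > k2 and t > t2)):
            return False
    return True

def enumerate_opas(n, p):
    """Enumerate all OPAs on the complete bipartite graph N x P (small n, p only)."""
    all_edges = [(k, t) for k in range(1, n + 1) for t in range(1, p + 1)]
    return [A for size in range(p + 1)
            for A in combinations(all_edges, size) if is_opa(A)]

def expected_count(n, p):
    """Number of ways to pick j items for j = 0, ..., p positions."""
    return sum(comb(n, j) for j in range(p + 1))

print(len(enumerate_opas(4, 3)), expected_count(4, 3))
```

Every OPA found this way only uses edges with $t \le k$, which is the reason the edge set can be restricted to $E$ as above.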

We index the edges sequentially when necessary by

$$e = (t-1)n + j - t(t-1)/2, \qquad (1)$$

where $t \le j \le n$ and $1 \le t \le p$.
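As a small sanity check of the sequential index (1), the sketch below assumes that (1) numbers the edge $(j,t)$ as $(t-1)n + j - t(t-1)/2$ and confirms that the indices run through $1,\dots,|E|$ without gaps. The function names are hypothetical.

```python
def edge_index(j, t, n):
    """Sequential index (1) of the edge (j, t), assuming t <= j <= n."""
    return (t - 1) * n + j - t * (t - 1) // 2

def all_indices(n, p):
    """Indices of all edges, visited position by position and by increasing item."""
    return [edge_index(j, t, n) for t in range(1, p + 1) for j in range(t, n + 1)]

# For n = 4 and p = 3 there are |E| = 4*3 - 3*2/2 = 9 edges.
print(all_indices(4, 3))
```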

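If $A \subseteq E$ is an OPA and $e = (k,t) \in A$, then we say that item $k$ is assigned to position $t$. Let $\mathbb{R}^E$ (rather than $\mathbb{R}^{|E|}$) denote the space of real vectors of length $|E|$. The support of $x \in \mathbb{R}^E$ is the index set of the nonzero components of $x$. For $A \subseteq E$, we denote $x^A = (x^A_e) \in \mathbb{R}^E$ the characteristic vector (or incidence vector), i.e.

$$x^A_e = \begin{cases} 1 & \text{if } e \in A \\ 0 & \text{if } e \notin A. \end{cases}$$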


We define the order preserving assignment polytope $OP^n_p$ to be the convex hull of the characteristic vectors of all OPAs, i.e.

$$OP^n_p = \mathrm{conv}\,\{x^A \in \mathbb{R}^E \mid A \subseteq E \text{ is an OPA}\}.$$

The order preserving assignment problem then is the optimization problem

(OP)  $\max\{c^T x \mid x \in OP^n_p\}$.

We call an OPA exact if $|A| = p$. It follows from the definition of an OPA that in an exact OPA item $k$ cannot be assigned to position $t$ if $k > n - p + t$, where $1 \le t \le p - 1$, since otherwise there are not enough items left to be assigned. For the case of exact OPAs we are thus led to consider a bipartite graph $G = (N, P, D)$ where

$$D = E - \{(k,t) \mid n-p+t+1 \le k \le n,\ 1 \le t \le p-1\}.$$

We define the polytope of exact OPAs $OP^n_{=p}$ to be the convex hull of the characteristic vectors of all exact OPAs, i.e.

$$OP^n_{=p} = \mathrm{conv}\,\{x^A \in \mathbb{R}^D \mid A \subseteq D \text{ is an OPA with } |A| = p\}.$$

Note that $|D| = |E| - p(p-1)/2$ and that $OP^n_{=p}$ is clearly not full dimensional. We denote aff($OP^n_{=p}$) the affine hull of the polytope $OP^n_{=p}$. The corresponding optimization problem over $OP^n_{=p}$ is denoted by (OP$^=$).

A linear description of a polytope is a set of linear inequalities and/or equations whose set of solutions equals the polytope. A linear description is minimal, if none of the inequalities and/or equations can be dropped from it without changing the solution set, i.e. every inequality defines a facet of the polytope. Evidently, we are interested in finding minimal descriptions of the polytopes $OP^n_p$ and $OP^n_{=p}$.


3  Assigning  at  most  p  positions 

We show first that $OP^n_p$ is a polytope of full dimension.

Proposition 1 $\dim OP^n_p = |E| = np - p(p-1)/2$.

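To formulate the problem we propose the following system of linear inequalities in zero-one variables which at first sight has little to do with “assignments”.

$$\sum_{j=1}^{n} x_{j1} \le 1 \qquad (2)$$

$$\sum_{j=t+1}^{k+1} x_{j,t+1} \le \sum_{j=t}^{k} x_{jt}, \quad t \le k \le n-1,\ 1 \le t \le p-1 \qquad (3)$$

$$x_{jt} \in \{0,1\}, \quad t \le j \le n,\ 1 \le t \le p \qquad (4)$$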

Inequality (2) states that at most one item can be assigned to position 1, while (3) expresses the condition that if some items $t+1,\dots,k+1$ are assigned to position $t+1$ then at least that many items from $t,\dots,k$ must be assigned to position $t$, where $t$ and $k$ are as specified. Inequalities (3) are intuitively not readily understood and were obtained by calculating by computer all of the facets of $OP^n_p$ for small values of $n$, $p$ and subsequent generalization.

Proposition 2 The system (2), (3) and (4) is a formulation of the order preserving assignment problem, i.e. every solution to (2), (3) and (4) is the incidence vector of an OPA and vice versa.
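Proposition 2 can be illustrated numerically on a small instance. The sketch below is an illustration only and assumes the reading of (2) and (3) as "at most one item in position 1" and "the items $t+1,\dots,k+1$ in position $t+1$ are covered by the items $t,\dots,k$ in position $t$". It enumerates all zero-one vectors on $E$ and keeps those satisfying both; the helper names are hypothetical.

```python
from itertools import product

def satisfies_2_3(x, n, p):
    """x maps edges (j, t) with t <= j <= n to 0 or 1."""
    # (2): at most one item in position 1
    if sum(x[(j, 1)] for j in range(1, n + 1)) > 1:
        return False
    # (3): for t <= k <= n-1 and 1 <= t <= p-1
    for t in range(1, p):
        for k in range(t, n):
            lhs = sum(x[(j, t + 1)] for j in range(t + 1, k + 2))
            rhs = sum(x[(j, t)] for j in range(t, k + 1))
            if lhs > rhs:
                return False
    return True

def looks_like_opa(x):
    """Used positions are 1..q, one item each, and items increase with the position."""
    chosen = sorted((t, j) for (j, t), val in x.items() if val == 1)
    positions = [t for t, _ in chosen]
    items = [j for _, j in chosen]
    return (positions == list(range(1, len(chosen) + 1))
            and all(a < b for a, b in zip(items, items[1:])))

def zero_one_solutions(n, p):
    edge_list = [(j, t) for t in range(1, p + 1) for j in range(t, n + 1)]
    sols = []
    for bits in product((0, 1), repeat=len(edge_list)):
        x = dict(zip(edge_list, bits))
        if satisfies_2_3(x, n, p):
            sols.append(x)
    return sols

sols = zero_one_solutions(4, 3)
print(len(sols), all(looks_like_opa(x) for x in sols))
```

For $n = 4$ and $p = 3$ the enumeration covers $2^9$ vectors, so it is only meant for instances of this size.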

Proposition 2 establishes, in particular, the validity of inequalities (2) and (3) for the polytope $OP^n_p$. There are $n(p-1) - p(p-1)/2 + 1$ constraints of the form (2) and (3) and the question is whether or not we need all of them. The following proposition shows more than that.

Proposition 3 Every inequality (2) or (3) defines a facet of the polytope $OP^n_p$. Moreover, the facets defined by (2) and (3) are all distinct.

The inequalities (2) and (3) are certainly not all the facets of $OP^n_p$ as we will need nonnegativity as well. However, inequalities (3) for $k = t$ read $x_{tt} \ge x_{t+1,t+1}$ and thus we are led to consider

$$x_{pp} \ge 0, \quad x_{jt} \ge 0 \ \text{ for } t+1 \le j \le n,\ 1 \le t \le p. \qquad (5)$$

Proposition 4 The inequalities (5) define distinct facets of $OP^n_p$. Moreover, these facets are distinct from the ones given by (2) and (3).

Theorem 1 $OP^n_p = \{x \in \mathbb{R}^E \mid x \text{ satisfies (2), (3) and (5)}\}$ for all $1 \le p \le n$. The linear system (2), (3) and (5) is a minimal description of $OP^n_p$ and the constraint matrix given by (2) and (3) is totally unimodular.

Sketch of Proof: Consider the $|E| \times |E|$ matrix $T$ given by the transformation

$$y_{kt} = \sum_{j=t}^{k} x_{jt} \quad \text{for } t \le k \le n,\ 1 \le t \le p. \qquad (6)$$

The matrix $T$ is nonsingular and the inverse transformation $T^{-1}$ is given by

$$x_{tt} = y_{tt}, \quad 1 \le t \le p \qquad (7)$$

$$x_{kt} = y_{kt} - y_{k-1,t}, \quad t+1 \le k \le n,\ 1 \le t \le p. \qquad (8)$$

Applying the transformation (6) to inequalities (2), (3) and (5) we find

$$y_{k+1,t+1} - y_{kt} \le 0, \quad t \le k \le n-1,\ 1 \le t \le p-1 \qquad (9)$$

$$y_{k-1,t} - y_{kt} \le 0, \quad t+1 \le k \le n,\ 1 \le t \le p \qquad (10)$$

$$-y_{pp} \le 0 \qquad (11)$$

$$y_{n1} \le 1 \qquad (12)$$

The matrix corresponding to the constraints (9),...,(12) is totally unimodular. Consequently, all the extreme points of $X = \{x \in \mathbb{R}^E \mid x \text{ satisfies (2), (3) and (5)}\}$ are zero-one valued and hence, by Proposition 2, $X = OP^n_p$. It follows that the linear system (2), (3) and (5) is a linear description of $OP^n_p$. By Propositions 3 and 4 we have its minimality. The total unimodularity of the constraint matrix given by (2) and (3) is a consequence of the total unimodularity of the matrix $T$ of our transformation and the total unimodularity of (9),...,(12), see e.g. Cook [1983]. □


Figure 2. OPA network for n=4 and p=3.

The proof of Theorem 1 suggests a “reformulation” of the order preserving assignment problem in terms of variables $y_{kt}$, where

$$y_{kt} = \begin{cases} 1 & \text{if one of the items } t,\dots,k \text{ is assigned to position } t \\ 0 & \text{if not} \end{cases}$$

and $t \le k \le n$, $1 \le t \le p$. The constraints (9) and (10) can then be interpreted accordingly, while constraints (11) and (12) ensure that all variables assume values between zero and one. More precisely, the optimization problem (OP) can be written as follows:

$$\max\{c^T x \mid x \in OP^n_p\} \qquad (13)$$

$$= \max\{c^T x \mid x \in \mathbb{R}^E,\ x \text{ satisfies (2), (3) and (5)}\} \qquad (14)$$

$$= \max\{d^T y \mid y \in \mathbb{R}^E,\ y \text{ satisfies (9),...,(12)}\} \qquad (15)$$

where the vector $d$ has components $d_{nt} = c_{nt}$, $d_{kt} = c_{kt} - c_{k+1,t}$ for $t \le k \le n-1$, $1 \le t \le p$. Any optimal solution to (15) is translated to an optimal solution to (OP) via (7) and (8).
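The change of variables can be traced on a small example. The sketch below assumes (6) is the running sum of $x$ within a position, (7) and (8) its inverse, and $d$ the backward difference of $c$ within a position; all function names are hypothetical. It checks that the two maps are inverse to each other and that $c^T x = d^T y$ for an arbitrary integer vector $x$.

```python
import random

def edges(n, p):
    """Edge set E as pairs (k, t) with t <= k <= n and 1 <= t <= p."""
    return [(k, t) for t in range(1, p + 1) for k in range(t, n + 1)]

def to_y(x, n, p):
    """Transformation (6): y_kt is the sum of x_jt over j = t, ..., k."""
    return {(k, t): sum(x[(j, t)] for j in range(t, k + 1)) for (k, t) in edges(n, p)}

def to_x(y, n, p):
    """Inverse transformation (7), (8)."""
    return {(k, t): y[(k, t)] - (y[(k - 1, t)] if k > t else 0) for (k, t) in edges(n, p)}

def d_from_c(c, n, p):
    """Objective of (15): d_nt = c_nt and d_kt = c_kt - c_{k+1,t} for k < n."""
    return {(k, t): c[(k, t)] - (c[(k + 1, t)] if k < n else 0) for (k, t) in edges(n, p)}

n, p = 5, 3
rng = random.Random(0)
E = edges(n, p)
x = {e: rng.randint(-3, 3) for e in E}
c = {e: rng.randint(-9, 9) for e in E}
y = to_y(x, n, p)
d = d_from_c(c, n, p)
print(to_x(y, n, p) == x, sum(c[e] * x[e] for e in E) == sum(d[e] * y[e] for e in E))
```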

Corollary 1 Problem (OP) can be solved in strongly polynomial time.

Let $Y$ be the image of $X$ under the transformation (6), i.e. $Y = \{y \in \mathbb{R}^E \mid y \text{ satisfies (9),...,(12)}\}$. Denote $H = (E,F)$ the OPA network associated with the linear program (15) where the nodes $E$ of $H$ correspond to the edges of the graph $G$, the arc set $F$ is given by (9) and (10) and rather than having edge weights we have the node weights defined above. We define a cut $(S : E - S)$ in $H$ to be feasible if $S = \emptyset$ or $S = E$ or if $\emptyset \ne S \subset E$ then (i) $(n,1) \in S$, $(p,p) \notin S$ and (ii) for all arcs $a = (h_a, t_a) \in (S : E - S)$, $h_a \notin S$, $t_a \in S$, where $h_a$ stands for “head” and $t_a$ for “tail” of the arc $a$. The “tail” $t_a$ of an arc $a$ corresponds to a “minus”, the “head” $h_a$ to a “plus” in the corresponding inequality (9) or (10). Furthermore, for $S \subseteq E$ we denote $H_S = (S, F_S)$ the sub-network of $H$ with nodes in $S$ and all arcs of $F$ having both endpoints in $S$. To prove the following we use repeatedly the fact that the incidence matrix of a tree with $b \ge 1$ nodes has a rank of $b - 1$.

Proposition 5 Every feasible cut $(S : E - S)$ in $H$ defines an extreme point of $Y$ and vice versa.

The linear program (15) thus consists of finding a maximal weighted feasible cut in the network $H$ where the weight of a cut $(S : E - S)$ is given by $\sum_{(k,t) \in S} d_{kt}$.

Two extreme points $y^1 \ne y^2$ of a polyhedron $P$ are called adjacent if the face of minimal dimension of $P$ that contains both $y^1$ and $y^2$ has a dimension of one. For any two extreme points $y^1 \ne y^2$ of $P$ denote $d(y^1, y^2)$ the smallest number of faces of dimension one of $P$ that, in a simplex-type algorithm, must be “traversed” in order to get from $y^1$ to $y^2$. The diameter of a polyhedron $P$, denoted diam($P$), is the maximum of $d(y^1, y^2)$ over all pairs of extreme points $y^1$ and $y^2$ of $P$.

Proposition 6 Let $y^0$, $y^1$ be the two extreme points of $Y$ corresponding to $S = \emptyset$, $S = E$ respectively.

(i) $y^0$ and $y^1$ are adjacent on $Y$ and every other extreme point $y$ of $Y$ is adjacent to both of them.

(ii) Two extreme points $y^S \ne y^T$ of $Y$, both different from $y^0$ and $y^1$, are adjacent if and only if $S \subset T$ and $H_{T-S}$ is connected or $T \subset S$ and $H_{S-T}$ is connected, where $S = \{(k,t) \in E \mid y^S_{kt} = 1\}$ and $T = \{(k,t) \in E \mid y^T_{kt} = 1\}$.

(iii) diam($Y$) = 2 for all $n \ge 3$ and $p \ge 2$.

For any order preserving assignment $A \subseteq E$ let $p_A$ be the last position assigned by $A$. $A = \emptyset$ is called the trivial and $A = \{(1,1),\dots,(p,p)\}$ the canonical OPA. An OPA $B$ dominates the OPA $A$ if $p_B \ge p_A$ and there is a position $\tau$ such that for every $(k,t) \in A$ there exists $h \le k$ such that $(h,t) \in B$ for all $1 \le t \le \tau$ and for all $\tau + 1 \le t \le p_A$ $(k,t) \in A$ implies $(k,t) \in B$, where $1 \le \tau \le p_A$. Note that the canonical OPA dominates all (nontrivial) OPAs. We say that two OPAs are adjacent if their incidence vectors $x^A$, $x^B$ are a pair of adjacent extreme points of $OP^n_p$.

Theorem 2 (i) The trivial and the canonical OPAs are adjacent and every other OPA is adjacent to both of them.

(ii) Two nontrivial, noncanonical OPAs $A$ and $B$ with $p_A \le p_B$ are adjacent if and only if $B$ dominates $A$.

(iii) The diameter of $OP^n_p$ equals two for all $n \ge 3$ and $p \ge 2$.

The proof of Theorem 2 follows directly from Proposition 6 since the transformation $T$ given by (6) is nonsingular and by interpreting the condition for adjacency given by Proposition 6 on the bipartite graph $G = (N, P, E)$.

4  Assigning  exactly  p  positions

To deal with the problem of assigning exactly $p$ positions in an order preserving manner we use zero-one variables $x_{jt}$ as before, but here $t \le j \le n - p + t$ and $1 \le t \le p$ as we are working in a lower-dimensional space. The exactness of the assignment implies that every zero-one vector $x \in \mathbb{R}^D$ that is the incidence vector of such an OPA satisfies

$$\sum_{j=t}^{n-p+t} x_{jt} = 1, \quad 1 \le t \le p. \qquad (16)$$

Proposition 7 $\dim OP^n_{=p} = |D| - p = p(n-p)$ and aff($OP^n_{=p}$) $= \{x \in \mathbb{R}^D \mid x \text{ satisfies (16)}\}$.

Consider now the inequalities

$$\sum_{j=t+1}^{k+1} x_{j,t+1} \le \sum_{j=t}^{k} x_{jt}, \quad t \le k \le n-p+t-1,\ 1 \le t \le p-1 \qquad (17)$$

$$x_{pp} \ge 0, \quad x_{n-p+1,1} \ge 0, \quad x_{jt} \ge 0 \ \text{ for } t+1 \le j \le n-p+t-1,\ 1 \le t \le p. \qquad (18)$$

Theorem 3 $OP^n_{=p} = \{x \in \mathbb{R}^D \mid x \text{ satisfies (16), (17) and (18)}\}$ for all $1 \le p \le n$. Moreover, the linear system (16), (17) and (18) is a minimal description of $OP^n_{=p}$ if $p < n$.

Corollary 2 The inequalities (17) and (18) define distinct facets of $OP^n_{=p}$ and every facet of $OP^n_{=p}$ is defined by one of the inequalities (17) or (18).

As the polytope $OP^n_{=p}$ is not full dimensional, there are, of course, many different linear descriptions of $OP^n_{=p}$ which are all equivalent modulo linear combinations of the equations (16) and multiplication by positive scalars. With respect to the optimization problem (OP$^=$) over the polytope $OP^n_{=p}$ we show mutatis mutandis the following corollaries.

Corollary 3 Problem (OP$^=$) can be solved in strongly polynomial time.

Corollary 4 The diameter of $OP^n_{=p}$ equals two for all $p \ge 7$ and $p + 2 \le n$.

5  Finding  optimal  assignments 

The dual to the linear program (15) is a minimum cost flow problem on the OPA network having a particular objective function. In fact, rather than calling our problem a minimum cost flow problem we shall refer to it as a minimal flow problem. We utilize this structure to give fast algorithms for the problems (OP) and (OP$^=$). The basic idea of our approach is to solve (15) by reducing the overall problem to a sequence of $p$ smaller problems. An optimal solution to (OP) is then obtained using the duality theorem of linear programming. The case of (OP$^=$) is similar.

Denote $v_{kt}$, $u_{kt}$ the dual variables associated with constraints (9) and (10) and $v_{pp}$, $u_{n+1,1}$ those associated with the constraints (11) and (12), respectively. At a typical node $(k,t)$ we have the “flow” situation as illustrated in Figure 3. Consequently, the flow conservation equation at node $(k,t)$ is given by

$$-u_{kt} + u_{k+1,t} + v_{k-1,t-1} - v_{kt} = d_{kt}, \quad t \le k \le n,\ 1 \le t \le p, \qquad (19)$$

where at the “border” of the OPA network certain nodes do not exist, for instance the nodes $(k,0)$ for $t = 1$ and $0 \le k \le n-1$. We model the corresponding variables anyway and require them to be zero in feasible solutions and the remaining ones to be nonnegative.

$$u_{11} = u_{tt} = u_{n+1,t} = v_{n,t-1} = 0, \quad 2 \le t \le p \qquad (20)$$

$$v_{k0} = 0, \ 0 \le k \le n-1, \qquad v_{kp} = 0, \ p+1 \le k \le n \qquad (21)$$

$$u_{n+1,1} \ge 0, \quad u_{kt} \ge 0, \quad v_{kt} \ge 0, \quad t \le k \le n,\ 1 \le t \le p. \qquad (22)$$


Figure 3. Flow at node $(k,t)$.

The minimal flow problem that we have to solve is given by

(NET$_p$)  $\min\ u_{n+1,1}$  subject to (19),...,(22).


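For $1 \le q \le p$ we denote (NET$_q$) the following relaxation of (NET$_p$):

$$\min\ u_{n+1,1}$$

subject to

$$-u_{kt} + u_{k+1,t} + v_{k-1,t-1} - v_{kt} = d_{kt}, \quad t \le k \le n,\ 1 \le t \le q \qquad (23)$$

$$u_{11} = u_{tt} = u_{n+1,t} = v_{n,t-1} = 0, \quad 2 \le t \le q \qquad (24)$$

$$v_{k0} = 0, \quad 0 \le k \le n-1 \qquad (25)$$

$$u_{n+1,1} \ge 0, \quad u_{kt} \ge 0, \quad v_{kt} \ge 0, \quad t \le k \le n,\ 1 \le t \le q. \qquad (26)$$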

(NET$_q$) is thus a minimal flow problem on the partial network given by the nodes $(k,t)$ with $t \le k \le n$, $1 \le t \le q$ and the nodes $(q+1,q+1),\dots,(n,q+1)$ for which we have, however, no flow conservation equations and thus no demands. Rather, the variables $v_{qq},\dots,v_{nq}$ are “surplus” variables in the problem (NET$_q$). For notational convenience we write a feasible solution to (NET$_q$) in vector form $(u^q, v^q)$ where we use the following indexing of the components of $u^q$

$$u^q = (u_{n1},\dots,u_{11},\ u_{n2},\dots,u_{22},\ \dots,\ u_{nq},\dots,u_{qq}) \qquad (27)$$

and likewise for $v^q$ where $1 \le q \le p$. The “flow value” $u^q_{n+1,1}$ is kept separately from the vectors. Note that the sequential indexing of $u$ and $v$ is not by increasing sequential index (1). It follows that if $(u^{q+1}, v^{q+1})$ is a feasible solution to (NET$_{q+1}$), then the “truncated” vector $(u, v)$, say, that is obtained from $(u^{q+1}, v^{q+1})$ by suppressing the $n - q$ last components in $u^{q+1}$ and $v^{q+1}$, respectively, defines a feasible solution to (NET$_q$). Among the optimal solutions to (NET$_q$) we want to single out a particular one. We recall that a vector $x \in \mathbb{R}^n$ is lexicographically greater than $y \in \mathbb{R}^n$ if $x_i = y_i$ for $i = 1,\dots,t$ and $x_{t+1} > y_{t+1}$ for some $0 \le t \le n-1$.

Definition 2 An optimal solution $(u^q, v^q)$ to (NET$_q$) is called lex-max if the vector $v^q = (v_{n1},\dots,v_{11},\ \dots,\ v_{nq},\dots,v_{qq})$ is lexicographically maximal among all optimal solutions to (NET$_q$) where $1 \le q \le p$.

Proposition 8 Problem (NET$_q$) has a unique lex-max solution for $1 \le q \le p$.

Proposition 9 A feasible solution $(u^1, v^1)$ to (NET$_1$) is lex-max if and only if

Proposition 10 If $(u^q, v^q)$ and $(u^{q+1}, v^{q+1})$ are lex-max solutions to (NET$_q$) and (NET$_{q+1}$), then


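[Relations (29), (30) and (31): the expressions are illegible in the source. Their stated index ranges are $t \le k \le n-1$, $1 \le t \le q$ for (29); $t \le k \le n-2$, $1 \le t \le q$ for (30); and $1 \le t \le q$ for (31).]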

To formulate a procedure that finds a lex-max solution $(u^{q+1}, v^{q+1})$ to (NET$_{q+1}$) given a lex-max solution $(u^q, v^q)$ to (NET$_q$) we need the following two auxiliary problems. In the first problem we want to find a lex-max solution $(u, v)$ which solves the problem

(P$_{q+1}$)  $\min\ u_{n+1}$
  subject to  $-u_k + u_{k+1} - v_k = d_k$,  $q+1 \le k \le n$,
              $u_{q+1} = v_n = 0$,  $u_k, v_k \ge 0$,

i.e. a solution such that the vector $v = (v_n,\dots,v_{q+1})$ is lexicographically maximal among all optimal solutions to (P$_{q+1}$). The second problem is to find an optimal solution $u$ which solves the problem

(P2)  $\min\ u_{n+1}$
  subject to  $-u_k + u_{k+1} = d_k$,  $u_k \ge 0$,  $p \le k \le n$.

In the following procedure $u$, $v$ and $d$ are assumed to be vector arrays of size $|E|$ that are indexed as in (34) and the arrays $D$, $U$, $V$ are of size $n+1$ and are “local” variables. Instructions that are separated by a semicolon are to be interpreted sequentially (left to right). “FLO” is the flow value $u_{n+1,1}$.

Procedure LEXMAX (n, p, q, u, v, d, FLO)

Input:   $(u^q, v^q)$, a lex-max solution to (NET$_q$) with flow value FLO, $d_{k,q+1} \in \mathbb{Z}$
         for $q+1 \le k \le n$ and $0 \le q < p$.

Output:  $(u^{q+1}, v^{q+1})$, a lex-max solution to (NET$_{q+1}$) with flow value FLO.

Step 1:  if $q = 0$ then
             do for $k = 1$ to $n$
                 $D_k := d_{k1}$
             enddo
             FLO := 0
         else
             do for $k = q+1$ to $n$
                 $D_k := d_{k,q+1} - v_{k-1,q}$
             enddo
         endif

Step 2:  if $q + 1 < p$ then
             call procedure P1(n, q, D, U, V)
         else
             call procedure P2(n, p, D, U, V)
         endif

Step 3:  do for $k = q+1$ to $n$
             $u_{k,q+1} := U_k$ ; $v_{k,q+1} := V_k$
         enddo
         do for $t = 1$ to $q$
             $u_{nt} := u_{nt} + U_{n+1}$ ; $v_{n-1,t} := v_{n-1,t} + U_{n+1}$
         enddo
         FLO := FLO + $U_{n+1}$
         return.

The procedure LEXMAX reduces the problem of solving the minimal flow problem (NET$_p$) to a sequence of optimization problems (P$_{q+1}$) for $0 \le q \le p-2$ and one optimization problem (P2), all of which can be solved fast by the following two procedures.

Procedure P1 (n, q, d, u, v)

Input:   $d_k \in \mathbb{Z}$ for $q+1 \le k \le n$.

Output:  a lex-max solution $(u, v)$ to (P$_{q+1}$).

         $u_{q+1} := 0$
         do for $i = q+1$ to $n-2$
             $u_{i+1} := \max\{0,\ d_i + u_i\}$ ; $v_i := u_{i+1} - u_i - d_i$
         enddo
         $u_n := \max\{0,\ -d_n,\ d_{n-1} + u_{n-1}\}$ ; $v_{n-1} := u_n - u_{n-1} - d_{n-1}$
         $u_{n+1} := d_n + u_n$ ; $v_n := 0$
         return.

It is a routine matter to check that procedure P1 returns a feasible solution $(u, v)$ to (P$_{q+1}$). Moreover, $u_{i+1} > 0$ implies $v_i = 0$ and $v_i > 0$ implies $u_{i+1} = 0$ in the solution for $q+1 \le i \le n-2$. Since (P$_{q+1}$) is the problem (NET$_1$), except that we have fewer variables, Proposition 9 applies with the necessary changes and thus the solution returned by procedure P1 is lex-max. Procedure P1 requires $O(n - q)$ time.
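Procedure P1 translates almost line by line into code. The following sketch assumes the reading of the pseudocode in which $u_{q+1}$ is fixed to zero, the forward recursion takes $u_{i+1} = \max\{0, d_i + u_i\}$, and the last two nodes are treated separately so that $v_n = 0$. The dictionary-based interface (`procedure_p1`, `feasible_p1`) is hypothetical and requires $q + 1 \le n - 1$.

```python
def procedure_p1(n, q, d):
    """d maps k -> d_k for k = q+1, ..., n. Returns (u, v) with u on q+1..n+1, v on q+1..n."""
    u = {q + 1: 0}
    v = {}
    for i in range(q + 1, n - 1):            # i = q+1, ..., n-2
        u[i + 1] = max(0, d[i] + u[i])
        v[i] = u[i + 1] - u[i] - d[i]
    u[n] = max(0, -d[n], d[n - 1] + u[n - 1])
    v[n - 1] = u[n] - u[n - 1] - d[n - 1]
    u[n + 1] = d[n] + u[n]
    v[n] = 0
    return u, v

def feasible_p1(n, q, d, u, v):
    """Check -u_k + u_{k+1} - v_k = d_k, nonnegativity, and u_{q+1} = v_n = 0."""
    balance = all(-u[k] + u[k + 1] - v[k] == d[k] for k in range(q + 1, n + 1))
    nonneg = all(val >= 0 for val in u.values()) and all(val >= 0 for val in v.values())
    return balance and nonneg and u[q + 1] == 0 and v[n] == 0

d = {2: 3, 3: -5, 4: 2, 5: -4}
u, v = procedure_p1(5, 1, d)
print(u, v, feasible_p1(5, 1, d, u, v))
```

In this example the flow value $u_{n+1}$ comes out as zero, and wherever $u_{i+1}$ is positive the corresponding $v_i$ vanishes, as stated above.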

Procedure P2 (n, p, d, u, v)

Input:   $d_k \in \mathbb{Z}$ for $p \le k \le n$.

Output:  an optimal solution $u$ to (P2) and a vector $v$.

         $u_p := 0$
         do for $j = p$ to $n$
             $u_{j+1} := u_j + d_j$ ; $v_j := 0$
             if $-u_{j+1} > u_p$ then $u_p := -u_{j+1}$
         enddo
         do for $j = p+1$ to $n+1$
             $u_j := u_j + u_p$
         enddo
         $v_p := u_p$ ; $u_p := 0$
         return.

The interchange of $v_p$ and $u_p$ is done in procedure P2 to facilitate the updating in the procedure LEXMAX. With $u_p = v_p$ the solution returned by procedure P2 is feasible for (P2). To see that it is optimal eliminate all variables except $u_p$. The remainder is a trivial linear program in the variable $u_p$ with the value as given in the procedure. Procedure P2 requires $O(n - p)$ time.
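Procedure P2 amounts to a prefix-sum pass followed by a uniform shift. The sketch below is one reading of the pseudocode, not a definitive implementation: it keeps the running maximum in a separate variable `shift` (the paper reuses $u_p$ for this purpose) and returns it in $v_p$, as the procedure does. The names `procedure_p2` and `shift` are hypothetical.

```python
def procedure_p2(n, p, d):
    """d maps k -> d_k for k = p, ..., n. Returns (u, v) with u on p..n+1, v on p..n."""
    u = {p: 0}
    v = {}
    shift = 0                                 # smallest value of u_p that keeps all u_j >= 0
    for j in range(p, n + 1):
        u[j + 1] = u[j] + d[j]                # prefix sums of d, starting from u_p = 0
        v[j] = 0
        shift = max(shift, -u[j + 1])
    for j in range(p + 1, n + 2):
        u[j] += shift
    v[p] = shift                              # the "interchange": the value is handed over in v_p
    u[p] = 0
    return u, v

d = {2: -3, 3: 1, 4: -2}
u, v = procedure_p2(4, 2, d)
print(u, v)
```

Setting $u_p$ back to the value stored in $v_p$ recovers a solution that satisfies $-u_k + u_{k+1} = d_k$ for all $p \le k \le n$.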

Proposition 11 Procedure LEXMAX returns a lex-max solution to (NET$_{q+1}$) in $O(n)$ time and a lex-max solution to (NET$_p$) can be found in $O(np)$ time.

We are now ready to formulate an algorithm that solves (OP). In the following algorithm we assume that $u$, $v$, $c$ and $d$ are arrays of length $|E|$, $w$ is an array of length $p$ and that the rest are scalars. All auxiliary arrays and scalars are initialized at zero, the user supplies the data $n$, $p$ and $c \in \mathbb{Z}^E$ and provides his own output statements.

Algorithm OPA (n, p, c)

Input:   $n \in \mathbb{N}$, $p \in \mathbb{N}$, $1 \le p \le n$, $c \in \mathbb{Z}^E$.

Output:  an optimal solution vector $x$ for (OP).

Step 0:  do for $t = 1$ to $p$
             $d_{nt} := c_{nt}$
             do for $k = t$ to $n - 1$
                 $d_{kt} := c_{kt} - c_{k+1,t}$
             enddo
         enddo

Step 1:  do for $q = 0$ to $p - 1$
             call LEXMAX(n, p, q, u, v, d, FLO)
         enddo
         if FLO = 0 stop.
         $q := 1$

Step 2:  if $u_{nq} = 0$ go to Step 3 ; if $v_{n-1,q} = 0$ go to Step 3 ; $q := q + 1$ ; if $q < p$ go to Step 2

Step 3:  $t := q + 1$ ; $w_t := n + 1$

Step 4:  $t := t - 1$ ; if $t = 0$ go to Step 6 ; $w_t := w_{t+1}$

Step 5:  $w_t := w_t - 1$ ; if $u_{w_t,t} = 0$ go to Step 4 ; go to Step 5

Step 6:  do for $t = 1$ to $q$
             $x_{w_t,t} := 1$
         enddo
         stop.

Theorem 4 Algorithm OPA finds an optimal order preserving assignment in $O(np)$ time.

We obtain an algorithm for (OP$^=$) by slightly modifying the algorithm OPA and thus the corollary.

Corollary 5 Algorithm OPA$^=$ finds an optimal solution to (OP$^=$) in $O((n-p)p)$ time.


6  Some  extensions 

Suppose that we would like to include a cost of using a position, but that we otherwise insist on order preserving assignments. So let $z_t = 1$ if position $t$ is used, $z_t = 0$ otherwise, and $f_t$ be the cost of using position $t$, where $1 \le t \le p$. Then the linear programming formulation for costly positions becomes

$$\max\ \sum_{t=1}^{p} \sum_{j=t}^{n} c_{jt}\, x_{jt} - \sum_{t=1}^{p} f_t\, z_t$$

subject to (2), (3), (5) and $\sum_{j=t}^{n} x_{jt} = z_t$ for $t = 1,\dots,p$.

One proves the implicit assertion that all extreme points $(x, z)$ of this linear program are zero-one valued by showing that for all integer $c \in \mathbb{Z}^E$ and $f \in \mathbb{Z}^p$ the objective function value is an integer number. To do so, it suffices to eliminate the variables $z_t$ from the problem and to reduce the above problem to (OP) with a changed vector $c$. Of course, by Proposition 2 the $z_t$ variables will be either zero or one. So, in particular, there is also another proof by discussing feasible bases for the problem and using Theorem 1.
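The elimination of the $z_t$ variables can be written out explicitly. The short derivation below assumes that the objective is the total profit minus the total position cost and substitutes the linking equations $z_t = \sum_{j=t}^{n} x_{jt}$; the changed profit vector is then $c_{jt} - f_t$.

```latex
\begin{align*}
\sum_{t=1}^{p}\sum_{j=t}^{n} c_{jt}\,x_{jt} - \sum_{t=1}^{p} f_t\,z_t
  &= \sum_{t=1}^{p}\sum_{j=t}^{n} c_{jt}\,x_{jt}
     - \sum_{t=1}^{p} f_t \sum_{j=t}^{n} x_{jt} \\
  &= \sum_{t=1}^{p}\sum_{j=t}^{n} \bigl(c_{jt} - f_t\bigr)\,x_{jt} .
\end{align*}
```

Under this assumption the problem with costly positions is an instance of (OP) with integer profits $c_{jt} - f_t$, so Algorithm OPA applies unchanged.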

Another variation that we have considered is the following. Suppose we want an order preserving assignment such that at least one of the $q$ first items gets assigned to position 1. One proves by the methods of Section 3 that intersecting $OP^n_p$ with the single constraint $\sum_{j=1}^{q} x_{j1} \ge 1$ the resulting polytope has zero-one valued extreme points only. Of course, the variables $x_{q+1,1},\dots,x_{n1}$ must be zero in every solution and can be dropped from the problem. It is also not difficult to determine a minimal description for the resulting problem. If we intersect $OP^n_p$ with several constraints of the above or a similar form we can, of course, not expect to get a polytope with zero-one extreme points only.


Note: Following the presentation of the original paper at the ORSA/TIMS Joint National meeting in October 1991, Maurice Queyranne (University of British Columbia) and, independently, Arie Tamir (Technion, Tel Aviv) have found direct “combinatorial” algorithms, based on dynamic programming, for the solution of the optimization problem (OP) that have the same running time as the algorithm proposed by us.
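To give an idea of what a dynamic programming approach to (OP) can look like, here is a minimal sketch. It is not the algorithm of the paper, nor is it claimed to be the method of Queyranne or Tamir; it is a plain recursion under the assumption that `F[t][k]` denotes the best profit when positions $1,\dots,t$ are filled with items from $1,\dots,k$. The names `opa_dp` and `opa_brute` are hypothetical, and `c[k][t]` is a 1-based profit table with an unused row and column 0.

```python
from itertools import combinations

NEG = float("-inf")

def opa_dp(c, n, p):
    """Return (optimal value, list of (item, position)) in O(n p) time."""
    F = [[NEG] * (n + 1) for _ in range(p + 1)]
    F[0] = [0] * (n + 1)
    for t in range(1, p + 1):
        for k in range(t, n + 1):
            # either item k is not in position t, or it is and positions 1..t-1 use items < k
            F[t][k] = max(F[t][k - 1], F[t - 1][k - 1] + c[k][t])
    best_t = max(range(p + 1), key=lambda t: F[t][n])
    assignment, k = [], n
    for t in range(best_t, 0, -1):            # backtrack from the last used position
        while F[t][k] == F[t][k - 1]:
            k -= 1
        assignment.append((k, t))
        k -= 1
    return F[best_t][n], assignment[::-1]

def opa_brute(c, n, p):
    """Reference value by enumerating all item subsets of size at most p."""
    return max(sum(c[k][t] for t, k in enumerate(items, start=1))
               for j in range(p + 1)
               for items in combinations(range(1, n + 1), j))

c = [[0, 0, 0, 0],
     [0, 5, 0, 0],
     [0, 3, 4, 0],
     [0, 6, 2, -1],
     [0, 1, 7, 3]]
print(opa_dp(c, 4, 3), opa_brute(c, 4, 3))
```

In the example with $n = 4$ and $p = 3$ the best assignment uses two positions only, which illustrates that the number of positions used is itself part of the optimization.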
In this report we are concerned with the numerical solution of the nonlinear min-max problems using smooth penalty functions. The min-max problem is stated as follows:

$$\min_{x}\ \max_{i \in M}\ f_i(x)$$

where the $f_i : \mathbb{R}^n \to \mathbb{R}$ are continuously differentiable functions and $M = \{1,\dots,m\}$. The min-max problem is known to be equivalent to the following nonlinear program:

[MINMAX]

$$\min_{x,z}\ z \quad \text{subject to} \quad f_i(x) \le z \quad \forall\ i = 1,\dots,m.$$

We propose to compute an approximate solution to problem [MINMAX] by solving the following problem:

[MINSEP]

$$\min_{x,z}\ z + \mu \sum_{i=1}^{m} \phi(f_i(x) - z)$$

where $\phi$ is a smooth penalty function and $\mu$ is its controlling parameter. We consider two choices of the penalty function $\phi$: (1) A linear-quadratic penalty function

$$\phi(t) = \begin{cases} 0 & \text{if } t \le 0 \\ t^2/(2\varepsilon) & \text{if } 0 \le t \le \varepsilon \\ t - \varepsilon/2 & \text{if } t \ge \varepsilon \end{cases} \qquad (3)$$

and, (2) A linear-logarithmic penalty function

$$\phi(t) = \begin{cases} 0 & \text{if } t \le 0 \\ (t + a)\left[\ln\!\left(\frac{t+a}{a}\right) - 1\right] + a & \text{if } 0 \le t \le \varepsilon \\ t - (\varepsilon - a) & \text{if } t \ge \varepsilon. \end{cases} \qquad (4)$$

The penalty functions (3)-(4) can be obtained as smooth approximations to the $\ell_1$ penalty function

$$r(t) = \max\{0, t\}. \qquad (5)$$

We note that the penalty functions (3)-(4) have continuous first derivatives whereas the $\ell_1$ penalty function does not. The parameter $\varepsilon$ controls the accuracy of the approximation. We give the following definition before we proceed:

Definition A vector $(x, z)$ is $\varepsilon$-feasible if

$$f_i(x) \le z + \varepsilon \quad \forall\ i = 1,\dots,m.$$
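The two smooth penalties can be sketched in a few lines to see how closely they track $\max\{0, t\}$. This is an illustration under stated assumptions: the linear-quadratic pieces are the ones forced by continuity of the function and its first derivative, and for the linear-logarithmic penalty the constant $a$ is taken as $\varepsilon/(e-1)$, a value chosen here so that the derivative is continuous at $t = \varepsilon$; the source does not state $a$. The function names are hypothetical.

```python
import math

def lq_penalty(t, eps):
    """Linear-quadratic smoothing of max(0, t): quadratic on [0, eps], slope 1 beyond."""
    if t <= 0:
        return 0.0
    if t <= eps:
        return t * t / (2.0 * eps)
    return t - eps / 2.0

def ll_penalty(t, eps):
    """Linear-logarithmic smoothing of max(0, t), with a = eps / (e - 1) (an assumption)."""
    a = eps / (math.e - 1.0)
    if t <= 0:
        return 0.0
    if t <= eps:
        return (t + a) * (math.log((t + a) / a) - 1.0) + a
    return t - (eps - a)

eps = 0.5
for t in (-1.0, 0.0, 0.25, 0.5, 2.0):
    print(t, max(0.0, t), lq_penalty(t, eps), ll_penalty(t, eps))
```

Both functions stay below $\max\{0, t\}$ and differ from it by at most $\varepsilon/2$ and $\varepsilon - a$ respectively, which is one way to read the statement that $\varepsilon$ controls the accuracy of the approximation.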

We established in [5] that the threshold value of the penalty parameter $\mu$ required in order to achieve $\varepsilon$-feasibility is a function of the maximum of the Lagrange multipliers to the original problem. However, the Lagrange multipliers associated with the inequalities in problem [MINMAX] are in the interval [0, 1]. This fact is readily verified from the first order Karush-Kuhn-Tucker optimality conditions for [MINMAX]. As a result of this observation, in an iterative scheme where the solution of problem [MINMAX] is approximately obtained by solving a finite number of penalty problems one can start with a value of $\mu$ in the range [0, 1]. This property enables one to alleviate the problems associated with forcing too much ill-conditioning on the penalty problem from the beginning. We construct such an iterative scheme below to solve min-max problems by means of unconstrained minimizations. To start the algorithm, an initial point $(x_0, z_0)$ is needed and initial values $\mu_0$, $\varepsilon_0$ for the penalty parameter $\mu$ and the smoothing parameter $\varepsilon$ need to be specified. We consider the following iterative steps:

Step 1 Using $(x_k, z_k)$ as the starting point for evaluating the penalty function, solve the problem [MINSEP]. Let $(x_{k+1}, z_{k+1})$ denote the optimal solution.

Step 2 If $(x_{k+1}, z_{k+1})$ is $\varepsilon$-feasible and $\varepsilon \le \varepsilon_{\min}$ terminate. Otherwise, update the penalty parameter $\mu$ and the accuracy parameter $\varepsilon$, set $k \leftarrow k + 1$ and proceed from Step 1.
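The bound on the Lagrange multipliers that motivates the starting value of the penalty parameter follows from a one-line calculation. In the derivation below the multiplier symbols $\lambda_i$ are introduced for illustration only, since the text does not name them, and it is assumed that the Karush-Kuhn-Tucker conditions hold at the solution.

```latex
\begin{align*}
L(x, z, \lambda) &= z + \sum_{i=1}^{m} \lambda_i \bigl(f_i(x) - z\bigr),
  \qquad \lambda_i \ge 0, \\
\frac{\partial L}{\partial z} &= 1 - \sum_{i=1}^{m} \lambda_i = 0
  \;\Longrightarrow\; \sum_{i=1}^{m} \lambda_i = 1
  \;\Longrightarrow\; 0 \le \lambda_i \le 1 \quad (i = 1,\dots,m).
\end{align*}
```

Since the multipliers are nonnegative and sum to one, none of them can exceed one, which is the property used to choose the initial penalty parameter in [0, 1].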

The  parameter  is  a  final  infeasibility  tolerance.  At  Step  2,  we  update 
the  penalty  parameter  /x  and  the  smoothing  parameter  c  as  follows; 

^*+1  =  m  max 
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and 

A^fc+i  =  V2f^k 

where  r]i  =  0.1  and  1)2  =  2.0.  The  above  choices  were  found  to  work  best 
throughout  the  computational  tests  performed.  Obviously  the  effectiveness 
of  the  iterative  scheme  is  contingent  upon  the  success  of  the  unconstrained 
minimizations  in  Step  1.  We  use  a  modified  Newton  method  due  to  Gay  [2] 
to  solve  the  penalty  problem.  The  modified  Newton  algorithm  iterates  by 
computing  an  optimal  locally  constrained  step  with  a  trust  region  on  the 
step.  It  uses  a  finite-difference  estimation  of  the  Hessian.  It  is  available  as 
a  subroutine  in  the  IMSL  Library. 
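
The two steps above can be sketched on a toy instance. The sketch below is
only an illustration under stated assumptions: it assumes the usual
linear-quadratic smoothing of max(0, t) for the penalty term (the exact form
of [MINSEP] is given in [5], not here), it replaces the modified Newton solver
by a nested bisection/ternary search that is adequate only for this
one-variable convex example (so the warm start of Step 1 is not used), and it
simply shrinks $\epsilon$ by $\eta_1$ at each pass. The functions `FUNCS` and
all helper names are hypothetical.

```python
def phi(t, eps):
    """Assumed linear-quadratic smoothing of max(0, t)."""
    if t <= 0.0:
        return 0.0
    if t < eps:
        return t * t / (2.0 * eps)
    return t - eps / 2.0


def dphi(t, eps):
    """Derivative of phi with respect to t (continuous, values in [0, 1])."""
    if t <= 0.0:
        return 0.0
    if t < eps:
        return t / eps
    return 1.0


# Hypothetical toy instance: min_x max{(x-1)^2, (x+1)^2}; optimum x = 0, value 1.
FUNCS = [lambda x: (x - 1.0) ** 2, lambda x: (x + 1.0) ** 2]


def penalty(x, z, mu, eps):
    """Smoothed penalty for: min z subject to f_i(x) - z <= 0."""
    return z + mu * sum(phi(f(x) - z, eps) for f in FUNCS)


def best_z(x, mu, eps):
    """Minimize the penalty over z for fixed x by bisection on its slope in z."""
    vals = [f(x) for f in FUNCS]
    # Slope is negative at lo (needs mu * len(FUNCS) > 1 and eps <= 1), +1 at hi.
    lo, hi = min(vals) - 1.0, max(vals)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        slope = 1.0 - mu * sum(dphi(v - mid, eps) for v in vals)
        if slope < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def minimize_penalty(mu, eps, lo=-5.0, hi=5.0):
    """Stand-in for Step 1: ternary search over x of the convex map x -> min_z penalty."""
    def g(x):
        return penalty(x, best_z(x, mu, eps), mu, eps)

    for _ in range(120):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    x = 0.5 * (lo + hi)
    return x, best_z(x, mu, eps)


def solve_minmax(mu=1.0, eps=0.1, eps_min=1e-6, eta1=0.1, eta2=2.0, max_outer=30):
    """Outer loop: penalty minimization, epsilon-feasibility test, parameter update."""
    for k in range(1, max_outer + 1):
        x, z = minimize_penalty(mu, eps)            # Step 1
        violation = max(f(x) - z for f in FUNCS)    # largest constraint violation
        if violation <= eps and eps <= eps_min:     # Step 2
            break
        eps *= eta1   # assumed simplification of the epsilon update
        mu *= eta2    # mu_{k+1} = eta2 * mu_k
    return x, z, k


x_opt, z_opt, n_minimizations = solve_minmax()
```

Starting from $\mu_0 = 1$ keeps the first penalty problems mild; the
conditioning only deteriorates as $\epsilon$ shrinks in later passes, which is
the motivation given in the text for starting in [0, 1].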

We  summarize  our  computational  experience  using  a  test  problem  from 
[3]. 

$$\min_x \max_{1 \le i \le m} f_i(x)$$

with the functions $f_1, \ldots, f_m$ as defined in [3], where $m = 2n - 2$,
$n$ even. We have run the example with n = 4, 10, 20, 30, 40, 60, 70 and 80.

This problem was chosen to test the behavior of the algorithm on problems
with many variables and functions. We also solved some of these problems
using (1) the nonlinear programming software MINOS 5.1 of Murtagh and
Saunders [4], to verify the accuracy of our solutions, and (2) the recently
developed nonlinear programming software Lancelot of Conn et al. [1]. The
results are reported in Table 5. The starting point in all runs was taken to
be (10,10,...,0).


                        LQP                  MINOS    Lancelot
(n, m)       MIT  NIT   NF   NG   NH          NF      NF    NG

(4, 6)         ?   33   59   37   33          NA      50    44
(10, 18)       ?   43   81   47   43          96     196   166
(20, 38)       3   27   54   30   27         175     288   247
(30, 58)       3   43   69   46   43         249     288   249
(40, 78)       3   39   81   42   39          NA     362   349
(60, 118)      3   40   82   43   40          NA     465   418
(70, 138)      3   41   98   44   41          NA     533   478
(80, 158)      3   25   68   28   25          NA     656   581

Table 5: Solution Statistics for the Minimax problem using the Linear-Quadratic
Penalty (LQP), MINOS and Lancelot. (MIT = Number of Penalty Minimizations,
NIT = Number of Newton Iterations, NF = Number of Function Evaluations,
NG = Number of Gradient Evaluations, NH = Number of Hessian Evaluations,
NA = Not Available, ? = entry illegible in the source)

We note that all three codes were run with default parameters. We do
not report the solution computed by the LQP algorithm since we find that
we have on the average four digits of accuracy with respect to the MINOS
solution. The accuracy of the solution reported by Lancelot is controlled
using the parameter gradient.tolerance. In our experiments its value is
set to $10^{-\ldots}$.


The objective of the Environmentally Sensitive Investment System
(ESIS) Project is to provide industry, as well as government departments
and agencies, with possible means to assess the environmental and
economic implications of capital-intensive projects and policies. More
specifically, ESIS helps to find wastewater management alternatives that
meet stated environmental regulatory standards in a technologically
sound and cost-efficient manner. The use of this intelligent decision
support system will enhance the ability of managers and planners to
explore the quantitative implications of a wide range of options.

ESIS incorporates a combination of artificial intelligence, expert
system and operations research techniques, database management and
graphic presentation tools, integrated into a user-friendly, dialog- and
menu-driven software system. ESIS is targeted primarily at top-of-the-
line personal computers and, possibly at a later stage, at workstations.


Keywords: environment-economy integration, intelligent decision support
systems, pulp and paper industry, waste management.

1. Introduction

Major industries - such as mining, chemical production, pulp and paper
manufacturing, food processing, etc. - typically have a significant
negative impact on the ambient environmental quality. These industries
face a serious challenge (expressed by a growing public concern and more
stringent environmental legislation) to find efficient methods of waste
treatment and disposal, while acting under the financial constraints of
market realities.

ESIS, the Environmentally Sensitive Investment System, is designed to
provide quantitative assistance in this complex decision making process.
Specifically, ESIS assists in selecting waste water management options
that meet jointly considered technical and economic constraints, as well
as environmental regulatory criteria, in an efficient manner. With the
help of ESIS, senior industry and government decision makers will be
able to systematically analyze and compare strategic investment and
operational choices in an interactive, computer-assisted environment.

Although the ESIS prototype is initially focused on the economically
mature Canadian pulp and paper industry - more precisely, mechanical
pulping (TMP/CTMP) mills - its broader conceptual relevance is evident
with respect to a number of capital-intensive industries that have a
potentially large, negative environmental impact. Several possible
extensions and generalizations of the prototype system will be
highlighted later.

The ESIS Project is supported by a consortium of industry, consulting,
research and government partners which represents a range of interests
and expertise and provides a variety of multi-disciplinary professional
assistance to the Project.

2. Environment-Economy Integration: The ESIS Model Structure

As stated above, ESIS will be a quantitative decision support tool that
assists in finding harmonized economic-environmental policies. This
objective is reflected by the generic system scheme shown in Figure 1:
waste management in the pulp and paper industry can be considered as one
of its possible realizations.

Figure  1  indicates  that  there  are  several  system  components  which  allow 
for  implementing  diverse  types  of  management  and  control  options. 
Because  of  the  adopted  scope  limitations  of  ESIS,  a  particular  emphasis 
is  placed  on  the  selection  of  technologically  feasible,  environmentally 
satisfactory  and  economically  efficient  waste  water  treatment 
alternatives.  An  attempt  is,  however,  made  to  include  -  at  least  on  the 
level  of  quantitative  sensitivity  analysis  -  some  other  management 
opportunities:  plan  modifications,  material  recycling  and  reuse  options, 
non-standard  waste  treatment  and  disposal  practices,  and  adaptively 
chosen  environmental  regulations  (effluent  quality  standards). 


3.  Waste  Water  Treatment  Engineering  System  Model 

The waste water treatment systems considered in the ESIS prototype are
determined by a given configuration of unit operations applicable to
mechanical pulp and paper mills. Figure 2 depicts the scheme of an
investigated treatment system.

The following unit treatment processes (UTP's) are included in ESIS
(their role will be briefly described in the full version of the paper):

- primary clarifier

- spill basin

- equalization basin

- activated sludge treatment

- aerated lagoon

- secondary clarifier

- sludge mixer

- sludge thickener

- sludge dewatering

For each UTP, a quantitative steady-state description has been developed
which makes possible the formulation of a corresponding mathematical
system model of complete treatment configurations. The input of this
system is basically determined by the pulp and paper manufacturing
process, which results in a given production level and a corresponding
generated raw waste stream. The output (treated) waste stream affects
the spatial and temporal evolution of the ambient environmental quality.

In the ESIS prototype four selected basic waste water treatment
configurations can be investigated. The pollution removal efficiency and
cost of each UTP are governed by choosing a few (2-6) principal design
variables and operational characteristics. Hence, finding feasible or
best investment and operational decisions necessitates the simultaneous
choice of several tens of decision variables.


4.  Analytical  Optimization  Model  Formulations 

Environmentally sensible planning means the integral consideration of
multiple (partially conflicting and non-commensurable) objectives,
criteria and aspects, which include those of engineering (technical
feasibility), economics (cost-efficiency of design and operations) and
environment ("satisfactory" quality, sustainable development). This
inherently "soft" problem structure excludes the possibility of finding
a unique best representation, as reflected by a single mathematical
programming model. Therefore the verbal problem statement below is
only a possible formulation of the basic ESIS paradigm.

ESIS Optimization Model Frame


OBJECTIVE

{find the most cost-efficient combination of unit waste water
treatment processes}

STATE  EQUATIONS  AND  CONSTRAINTS 

{production  process  driven  initial  conditions} 

{input  waste  stream  characteristics} 

{explicit  ranges  of  decision  variables:  waste  water  treatment 
equipment  types  and  sizes} 

{explicit ranges of decision variables: basic operational
characteristics of UTP's}

{implicit technological constraints and relations between UTP's}

{analytic description of the waste water treatment process}

{implied  resource  (construction,  operation  and  maintenance)  demands 
of  the  waste  water  treatment  system} 

{resulting  effluent  and  solid  waste  stream} 

{effects  of  waste  output  on  environment} 

{environmental  quality  standards} 


The concrete numerical realization(s) of the above optimization paradigm
typically involve several tens of decision variables and constraints,
plus hundreds of constants and parameters. In addition, essential model
features include complicated ("black box" like) structure, nonlinearity,
absence of analytical derivatives of a number of implicit constraints,
significant uncertainties and fluctuations (stochasticity) of system
behaviour, and - as a direct consequence of the above facts -
computationally intensive evaluation of the engineering system optimization
model. In the terminology of mathematical programming, even the
deterministic base model versions can merely be classified as (potentially)
multiextremal Lipschitz optimization problems, with no obvious possibility
of directly justifying a narrower model class. Summing up these
observations, the exact finding of the (mathematically) optimal decision,
or better, of a "menu" of alternative efficient decisions, typically
leads to complicated optimization issues.

5. Solution Approaches

Given the spectrum of targeted applications (from obtaining a fast,
preliminary evaluation of system performance to detailed analysis and "fine
tuning"), ESIS offers a number of tools that can be applied to formulate,
investigate and solve system models of diverse levels and depths.
These techniques include

- knowledge-based reasoning and spreadsheet calculations

- nonlinear programming model versions and optimization techniques

- statistical uncertainty analysis and stochastic model extensions

The solution approaches listed above are capable of providing responses
pertinent to a large variety of problem statements. The information
obtained in the different search modes is communicated to the user via a
dialogue-based, seamless interface which also enables the adaptive,
sequential application of the solution techniques. (Illustrative test
results obtained by applying these solution approaches will be provided
in the full version of the paper.)


6.  Implementation  Aspects 

The combination of optimization, statistical analysis, expert system,
database management, and visualization concepts and techniques is
realized by applying a modular approach. This structure makes possible the
adaptive decomposition, aggregation and substitution of the ESIS
software components. (The parallel testing of different development
versions still continues.) The current prototype implementation runs on
top-of-the-line personal computers (IBM PS/2 386 and 486 machines). A
workstation (IBM RISC 6000) development is being proposed for a
subsequent stage.


7.  Conclusions  and  Further  Research 

The  increasing  concern  for  environmentally  responsible  waste  management 
indicates  that  the  ESIS  concept  and  its  implementation  have  remarkable 
potential.  The  immediate  future  work  includes  additional  testing  and 
improvement  of  the  prototype,  and  further  contacts  with  pulp  and  paper 
mill  personnel,  suppliers  and  equipment  manufacturers,  in  order  to 
obtain additional, site-specific information. This work will also be
related  to  supplementing  ESIS  with  in-plant  process  modification  options 
and  novel  waste  treatment/disposal/processing/use  techniques. 

The  currently  active  features  of  ESIS  include  an  information  base  on 
environmentally  sensitive  planning  (in  the  context  of  the  pulp  and  paper 
industry);  feasibility  analysis  of  (pre-) selected  management  options; 
preliminary  screening  and  optimization  of  treatment  configurations; 
quantification  of  system  uncertainties  and  fluctuations;  statistical 
verification  (ex-post  analysis)  of  selected  management  options;  and 
report writing (executive summary and detailed technical report levels).

Possible  future  extensions  and  generalizations  can  lead  to  site-specific 
implementations  of  the  ESIS  concept;  refinements  of  the  environmental 
impact  analysis;  adaptation  of  the  ESIS  concept  to  other  industries;  and 
educational/professional  training  versions. 

(Acknowledgements, appendix and an extensive reference list will be
provided in the complete version of the paper.)

Figure 2. A Waste Water Treatment System Configuration, Applicable in


Abstract: In interior point methods it is generally not possible to achieve a vertex optimal
solution for degenerate linear programming problems. Two of the most popular methods for
attempting to achieve a vertex (near) optimal solution are the method of random perturbations
and the method of controlled random perturbations. When a basis needs to be recovered in
addition, a number of Gaussian elimination steps is needed, which depends on the degree of
degeneracy. In this paper, an alternative method to force a vertex solution using the dual affine
method is presented. The dual affine method is an interior point method which is computationally
efficient but not commonly used when a basis needs to be recovered. The proposed method
uses a quadratic auxiliary function based on the properties of convex sets and does not require
any random perturbations. Results from test problems are presented and compared with an
existing method.

1.  INTRODUCTION 

Interior point methods are now proven to be computationally efficient for many classes of large
linear programming problems, Lustig et al. (1992). However, for degenerate linear programming
problems the solution from interior point methods converges to the relative interior of the
optimal face, providing an optimal solution with a minimal number of zero coordinates. By
contrast, a basic (vertex) solution available from the simplex method has a maximal number of
zero coordinates, which may sometimes be more favourable for operational reasons. In addition,
for some integer programming problems, as well as for sensitivity analysis, parametric
programming, and for providing warm starts in some nonlinear programming algorithms, it may be
necessary to have a basic optimal solution. The primal-dual interior point methods are usually
favoured for recovering a basic solution, Megiddo (1988) and Mehrotra (1989). However, there
have been successful attempts at using the dual affine method for basis recovery as well, for
example, in Ponnambalam and Vannelli (1989), and Ponnambalam et al. (1992).

A basis recovery procedure is expected to be more efficient if a vertex solution is available at
least for either the primal or the dual, or preferably for both. In addition to achieving a vertex
optimal solution, when the basic set has to be determined it is necessary to perform an uncertain
number of Gaussian elimination steps, which depends on the degeneracy of the problem. The two
methods that are popular for forcing a vertex solution using interior point methods are (i) the
perturbation method, for example, as in Ponnambalam and Vannelli (1989), and (ii) the controlled
perturbation method of Mehrotra (1989). Both methods depend on random perturbations and there is
some uncertainty as to the effectiveness of these methods in achieving a vertex optimal solution.
In this paper, we use a recently proposed method of Ponnambalam (1992), wherein a quadratic
auxiliary function is used to encourage a vertex solution using a dual affine method. In this
proposed method, a bidimensional subspace is determined based on maximizing an auxiliary
function which, in addition to the original objective function, includes terms favouring a vertex
optimal solution whose 2-norm squared has the maximum value in the optimal face. Test results
are presented for comparison.

2.  Linear  Programs  and  Degeneracy 

Let the primal linear programming problem be in the standard form

$$(P) \quad \min \{ c^T x : Ax = b,\ x \ge 0 \} \tag{1}$$

where $A$ is an $m \times n$ matrix, $b$ and $c$ are $m$- and $n$-dimensional vectors,
respectively, $x$ is an $n$-dimensional vector, and we assume that rank$(A) = m$. The dual of
the linear program (P) is

$$(D) \quad \max \{ b^T y : A^T y + s = c,\ s \ge 0 \} \tag{2}$$

where $y$ and $s$ are $m$- and $n$-dimensional vectors, respectively. The pair of primal (P) and
dual (D) problems is called primal degenerate if there exists a primal feasible $x$ with less
than $m$ positive coordinates, and dual degenerate if there exists a dual feasible $s$ with less
than $n-m$ positive coordinates, Güler et al. (1991). At this time it is worth noting that, in
the case of interior point methods and when a vertex optimal solution is required, the main
concern is degeneracy at the optimal face, unlike in the simplex method, where degeneracy is of
no serious concern once the optimal vertex is reached. In this paper, we propose to use the dual
affine method of Dikin (1967) and Adler et al. (1989) on the dual problem (D), where due to dual
degeneracy the optimal solution is in the interior of the optimal face. We do not attempt to
overcome the primal degeneracy in the proposed algorithm as it can be resolved best during the
Gaussian elimination steps.

3. Vertex Solution using Dual Affine Method

For the following sections we will use the inequality form of the dual problem.

$$(D) \quad \max \{ b^T y : A^T y \le c \} \tag{3}$$

Using Proposition 1 we form an auxiliary function and solve a new optimization problem that
will provide a vertex solution for the dual problem.

Proposition 1: In linear programming problems with dual degeneracy, there is at least one
optimal vertex solution $y^*$ with the following condition:

$$\|y^*\|_2 \ge \|y\|_2, \quad \text{for all } y \text{ in the optimal face.}$$

Proof. This property follows from the property of the convex hull that defines the feasible region
in the Dual (D): the function $\|y\|_2^2$ is convex, so its maximum over a bounded optimal face
is attained at a vertex of that face.

Therefore, to obtain a vertex optimal solution of the Dual the following problem needs to be
solved. Solve

$$\max \{ y^T y : A^T y \le c,\ b^T y \ge z^* \} \tag{4}$$

where $z^*$ is the solution of the dual problem in equation (3). Although at the outset the problem
in equation (4) looks like a hard problem, we show how the dual affine algorithm can be
modified to solve the problem with minimal additional effort.

Algorithm

Let $y^0$ and $\gamma$ be given such that $A^T y^0 < c$ and $\gamma = 0.9$.

Set $k := 0$

While stopping criterion is not satisfied, do

    $v^k := c - A^T y^k$
    $b_y := 2 y^k$
    $D_k := \mathrm{diag}(1/v_1^k, \ldots, 1/v_n^k)$
    $H_k := (A D_k^2 A^T)^{-1}$
    $w_1 := H_k b$
    $w_2 := H_k b_y$
    When $w_1$ and $w_2$ are linearly independent,
    solve LP1 and LP2 to get $\lambda_1$ and $\lambda_2$
    $y^{k+1} := y^k + \gamma \times (\lambda_1 w_1 + \lambda_2 w_2)$
    set $k := k+1$
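
To make the quantities $v^k$, $D_k$, $H_k$ and $w_1$ concrete, the following
sketch runs the plain dual affine iteration, that is, the loop above with the
direction $w_1$ only and the usual ratio test for the step length. It
deliberately omits the vertex-seeking direction $w_2$ and the subproblems LP1
and LP2, so it is not the proposed method. The small dual-degenerate LP used
as data, the stopping rule and all helper names are assumptions made for
illustration.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def solve_linear(M, r):
    """Solve M w = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    a = [list(row) + [r[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(a[i][col]))
        a[col], a[piv] = a[piv], a[col]
        for i in range(col + 1, n):
            f = a[i][col] / a[col][col]
            for j in range(col, n + 1):
                a[i][j] -= f * a[col][j]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][j] * w[j] for j in range(i + 1, n))
        w[i] = s / a[i][i]
    return w


# Illustrative data (assumed): max b^T y subject to a_j^T y <= c_j.
# The objective is parallel to the first constraint, so the optimal face
# is the segment between (0, 0, 1) and (4, 4, 0), with optimal value 4.
ROWS = [(1, 0, 4), (0, 1, 4), (1, -1, 0), (-1, 0, 0), (0, -1, 0), (0, 0, -1)]
C = [4, 4, 0, 0, 0, 0]
B = (1, 0, 4)


def dual_affine(y, gamma=0.9, max_iter=100, tol=1e-9):
    """Plain dual affine scaling from a strictly feasible y (A^T y < c)."""
    m = len(y)
    for _ in range(max_iter):
        v = [c - dot(a, y) for a, c in zip(ROWS, C)]        # slacks v = c - A^T y
        if min(v) <= 1e-12:                                  # numerical safeguard
            break
        # A D^2 A^T with D = diag(1 / v_j)
        M = [[sum(a[p] * a[q] / (s * s) for a, s in zip(ROWS, v))
              for q in range(m)] for p in range(m)]
        w = solve_linear(M, list(B))                         # w_1 = (A D^2 A^T)^{-1} b
        dv = [dot(a, w) for a in ROWS]                       # slack decrease per unit step
        ratios = [s / d for s, d in zip(v, dv) if d > 0.0]
        if not ratios:                                       # no blocking constraint
            break
        step = gamma * min(ratios)                           # fraction gamma of the way to the boundary
        if step * dot(B, w) < tol:                           # negligible objective gain
            break
        y = [yi + step * wi for yi, wi in zip(y, w)]
    return y


y_final = dual_affine([0.1, 0.2, 0.1])
objective = dot(B, y_final)
```

Because the step stops at a fraction $\gamma < 1$ of the distance to the
nearest constraint, every iterate stays strictly feasible, and the objective
increases at each accepted step.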


Problems LP1 and LP2 are small linear programming problems of 2 unknowns each and are easily
solved. Similar linear programming problems have been solved in Boggs et al. (1989) to get
weights for centering and affine directions in one of the many versions of the dual affine
algorithm.

$$\text{LP1}: \quad \max\ b^T (\lambda_1 w_1 + \lambda_2 w_2) \tag{5}$$

subject to

$$A^T (\lambda_1 w_1 + \lambda_2 w_2) \le v^k$$

and

$$\text{LP2}: \quad \max\ b_y^T (\lambda_1 w_1 + \lambda_2 w_2) \tag{6}$$

subject to

$$A^T (\lambda_1 w_1 + \lambda_2 w_2) \le v^k$$
$$b^T (\lambda_1 w_1 + \lambda_2 w_2) \ge z_\lambda^*$$

where $z_\lambda^*$ is the solution of LP1. It is noted that, if $w_1$ and $w_2$ are linearly
dependent, then problems LP1 and LP2 may have unbounded solutions. The problem LP1 retains the
degenerate characteristics of the original problem in equation (3); that is, parallel lines in
equation (3) remain parallel in equation (5). In the case of dual degeneracy in (3), there is dual
degeneracy in (5), but the optimal face in (5) is defined by two vertex solutions only,
Ponnambalam (1992). Therefore, if the simplex method is used to solve problem LP1, we get a
solution for $\lambda_1$ and $\lambda_2$ and we get the step length for the dual problem in
equation (3). Because of LP2, we also solve the problem in equation (4) subject to
$b^T y \ge z^*(k)$, where $z^*(k)$ is the objective function value found at the $k$-th iteration.
It is easy to show that $z^*(k) \to z^*$ (the solution of the dual problem in equation 3) when
$k \to \infty$. Thus, we get a dual optimal vertex solution using the algorithm proposed. Although
the dual affine method does not seem to have a proof of global convergence for large steps, in
practice, the maximum number of iterations required to solve the dual problem in (3) is small,
in the order of 20 to 80.

4.  Test  Problems  and  Results 

Some preliminary testing of this algorithm has been done for small and medium size problems
using the solvers of MATLAB(TM), Moler et al. (1987), for dense matrices. Due to the limitation
of space, only results from solving selected problems are reported here. All problems we report
here have degeneracy in the dual, or in both the dual and the primal. The algorithm was also
tested on nondegenerate problems and showed no significant difference in performance. All the
problems reported here are in the dual form as in equation (3).

Problem 1: $\max\ y_1 + 4 y_3$

Subject to:

$$y_1 + 4 y_3 \le 4$$
$$y_2 + 4 y_3 \le 4$$
$$y_1 - y_2 \le 0$$
$$-y_i \le 0, \quad \text{for all } i$$

Problem 1 is primal and dual degenerate and the two dual vertex solutions are (4,4,0) and (0,0,1).
Starting from different feasible solutions, for example (0.1,0.2,0.1), the solution reached was
(3.9996,3.9996,0.0009), a solution close enough to the vertex solution (4,4,0), and it was achieved
in 10 iterations with a duality gap of less than $10^{-\ldots}$. It is noted that different
starting solutions could lead to different optimal vertex solutions, each of which maximizes the
2-norm squared at least in the local sense. The following two problems were used to test the
effect of size as well as degeneracy on the performance of the algorithm.
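
As a quick check of Proposition 1 on Problem 1, the sketch below samples the
segment joining the two stated dual vertex solutions, (0,0,1) and (4,4,0), and
looks for the point with the largest squared 2-norm. It assumes, as the text
indicates, that this segment is the optimal face; the function names are
hypothetical.

```python
V0 = (0.0, 0.0, 1.0)   # first optimal vertex stated in the text
V1 = (4.0, 4.0, 0.0)   # second optimal vertex stated in the text


def on_segment(t):
    """Point (1 - t) * V0 + t * V1 on the assumed optimal face, 0 <= t <= 1."""
    return tuple((1.0 - t) * a + t * b for a, b in zip(V0, V1))


def sq_norm(y):
    """Squared 2-norm y^T y, the auxiliary objective of equation (4)."""
    return sum(c * c for c in y)


samples = [on_segment(i / 100.0) for i in range(101)]
best = max(samples, key=sq_norm)      # maximizer of y^T y over the sampled face
reported = (3.9996, 3.9996, 0.0009)   # solution reported in the text
```

Along the segment the squared norm is the convex quadratic
$33t^2 - 2t + 1$, so its maximum sits at an endpoint; here that is (4,4,0),
the vertex the reported solution approaches.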

Problem 2: $\max\ \sum_{i=1}^{m} y_i$

Subject to:

$$\sum_{i=1}^{m} y_i \le m$$
$$-y_i \le 0, \quad \text{for all } i$$

Problem 2 has only the dual degeneracy, with vertex solutions of type (m,0,...,0), that is,
$y_i = m$ and $y_j = 0$ for $j \ne i$, for all $i$. Because of the dense solvers used, only
problems of size m=5 to m=200 were tested. All problems were solved in about 10 iterations each,
reaching an optimal (near) vertex solution usually of the type (m,0,...,0).

Problem 3: $\max\ \sum_{i=1}^{m} y_i$

Subject to:

$$\sum_{i=1}^{m} y_i \le m$$
$$\sum_{i=1,\, i \ne j}^{m} y_i \le m, \quad j = 1, \ldots, m$$
$$-y_i \le 0, \quad \text{for all } i$$

Problem 3 differs from Problem 2 only in the second set of m constraints, added to make the
problem dual and primal degenerate. For sizes m=5 to m=200 the algorithm was able to converge to
an optimal (near) vertex solution in approximately 16 iterations for all sizes, mostly to the
solution of type (m,0,...,0). However, it is noted that the proximity to the vertex in Problem 3
was slightly worse than in Problem 2. This may be expected due to the very high primal-dual
degeneracies present in Problem 3.

Many more degenerate and nondegenerate test problems, including the degenerate example
suggested by Todd (Tuncel, 1992), were tested successfully as reported in Ponnambalam (1992).
Although the solution obtained was dual feasible, optimal, and very near a vertex, the primal
solution thus obtained need not be feasible. In order to obtain a primal feasible solution, if
necessary, the only additional step taken so far in our tests is to perform 2 to 4 additional
standard dual affine iterations using only the dual affine direction. Then, a basis recovery
procedure as discussed in Ponnambalam et al. (1992) was applied successfully to the above test
problems, requiring a small number of Gaussian elimination steps which depended on the starting
point. However, further research in basis recovery as well as in the following areas is currently
being actively pursued.

(i) Using a barrier function model $\{\max \ldots\}$ where $\mu$ is initially large and tends to
zero at the end.

(ii) Using a potential function model
$\{\max\ (n+1)\log(b^T y - z_{LB}) + y^T y + \sum \log(c - A^T y)\}$, where $z_{LB}$ is the
lower bound. In both cases (i) and (ii), the maximization is done using a two or three
dimensional subspace.

(iii) Finding the step length over a 3 dimensional subspace, namely, that defined by the
affine, centering, and the vertex directions.

(iv) Solving a small quadratic optimization problem as in Singh (1992) to find the step
length instead of solving LP1 and LP2, and lastly,

(v) Solving the Netlib suite of problems using the proposed algorithm.

An interesting problem that may be solved using an algorithm similar to that presented here is
the bi-level linear programming problem arising in many decision analyses such as Game
Theory, Bi (1992).

5.  Conclusions 

Achieving vertex optimal solutions in interior point methods is expected to aid in the faster
recovery of the basis set in linear programming applications. The algorithm presented here has
potential for achieving such an objective. Further theoretical and practical work in the
directions suggested here has interesting implications for many types of optimization problems,
including multi-objective optimization.


Extended abstract

In this paper we consider the following linearly constrained optimization problem:

$$\begin{aligned}
\min\ & \Big\{ c^T x + \sum_{i=1}^{N} Q_i(x) \Big\} \\
\text{subject to}\ & Ax = b, \\
& x \ge 0,
\end{aligned} \tag{1}$$

where $A$ is an $m \times n$ matrix, $x$, $c$, $b$ are vectors of suitable sizes, compatible with
the above formulation, and $Q_i(x)$, $i = 1, 2, \ldots, N$ are some special polyhedral (in other
words: piecewise linear) functions. We assume that

$$Q_i(x) = f_i(\xi^i - T^i x), \quad i = 1, 2, \ldots, N,$$

where $\xi^i$ is an $r^i$-component vector, $T^i$ an $r^i \times n$ matrix and $f_i$ a simplicial
function on a bounded convex polyhedron $K^i \subset R^{r^i}$, $i = 1, 2, \ldots, N$.

A function $f(z)$, $z \in R^r$ is said to be polyhedral (simplicial) on the bounded convex
polyhedron $K \subset R^r$ if there exists a subdivision of $K$ into $r$-dimensional convex
polyhedra (simplices), with pairwise disjoint interiors, such that $f$ is continuous on $K$ and
linear on each subdividing polyhedron (simplex).

A  simplicial  function  is  a  special  polyhedral  function.  The  sum  of  a  finite  number  of 
simplicial  functions,  defined  on  the  same  convex  polyhedron  K,  is  a  polyhedral  function. 

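Let $S_{i1}, S_{i2}, \ldots, S_{ih_i}$ designate the subdividing simplices and $z_{i1}, z_{i2}, \ldots, z_{ik_i}$ the set of their
vertices in case of the function $f_i$ and the corresponding convex polyhedron $K^i$. Furthermore,
let $f_{ij} = f_i(z_{ij})$, $j = 1, \ldots, k_i$, $i = 1, 2, \ldots, N$. Then problem (1) can be reformulated in
the following manner:

$$
\begin{aligned}
\min\ & \Big\{ c^T x + \sum_{i=1}^{N} \sum_{j=1}^{k_i} f_{ij} \lambda_{ij} \Big\} \\
\text{subject to}\ & Ax = b, \\
& T^i x + \sum_{j=1}^{k_i} z_{ij} \lambda_{ij} = \xi^i, \\
& \sum_{j=1}^{k_i} \lambda_{ij} = 1, \quad i = 1, 2, \ldots, N, \\
& x \ge 0, \quad \lambda \ge 0.
\end{aligned}
\tag{2}
$$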
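In one dimension, the idea behind this reformulation (writing the argument of a piecewise linear function as a convex combination of breakpoints and evaluating the function as the same combination of breakpoint values) can be illustrated by the following minimal sketch. The function names, the example function |z| and the interval [-2, 2] are assumptions made for illustration only; they do not come from the paper.

```python
from bisect import bisect_right


def lambda_representation(breakpoints, t):
    """Return weights lam with sum(lam) == 1 and sum(lam[j] * breakpoints[j]) == t.

    Only the two breakpoints that bracket t receive nonzero weight, which
    corresponds to choosing the subdividing simplex (interval) containing t.
    """
    if not breakpoints[0] <= t <= breakpoints[-1]:
        raise ValueError("t lies outside the bounded interval K")
    lam = [0.0] * len(breakpoints)
    # Index of the right end of the subinterval that contains t.
    j = min(bisect_right(breakpoints, t), len(breakpoints) - 1)
    left, right = breakpoints[j - 1], breakpoints[j]
    lam[j] = (t - left) / (right - left)
    lam[j - 1] = 1.0 - lam[j]
    return lam


def evaluate(breakpoints, values, t):
    """Evaluate the piecewise linear function as sum_j f_j * lambda_j."""
    lam = lambda_representation(breakpoints, t)
    return sum(l * f for l, f in zip(lam, values))


# Illustrative data: f(z) = |z| on K = [-2, 2] with vertices -2, 0, 2.
breakpoints = [-2.0, 0.0, 2.0]
values = [2.0, 0.0, 2.0]
```

For a convex function such as this one, a linear program that minimises the weighted sum of the values will itself select weights on adjacent breakpoints, which is why no adjacency condition has to be imposed explicitly in that case.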

Many  optimization  problems  can  be  cast  into  the  form  (2).  Below  we  list  some  of  them. 

(i) Minimization of a convex, separable, piecewise linear objective function subject to
linear constraints.

(ii) The discrete simple recourse stochastic programming problem is shown to be a special
case of problem (2) in Prékopa (1990).

(iii) The linearly constrained minimum absolute deviation problem is also shown by
Prékopa (1990) to be a special case of problem (2).

(iv) Problems with loose constraints. Assume that $A$, $b$, $c$, $x$ are as in problem (1) and
consider the problem:

$$
\begin{aligned}
\min\ & c^T x \\
\text{subject to}\ & Ax = b, \\
& T^1 x = \xi^1, \\
& T^2 x = \xi^2, \\
& \quad \vdots \\
& T^N x = \xi^N,
\end{aligned}
$$

where $\xi^1, \xi^2, \ldots, \xi^N$ are $r_1, r_2, \ldots, r_N$ component vectors, and $T^1, T^2, \ldots, T^N$ are
$r_1 \times n, r_2 \times n, \ldots, r_N \times n$ matrices, respectively. The constraints $T^i x = \xi^i$, $i = 1, 2, \ldots, N$, are
allowed not to be satisfied at the expense of some penalty. We say that these constraints are
"loose". If the penalty on the $i$th constraint is given by $f_i(\xi^i - T^i x)$, where $f_i$ is a piecewise
linear function on a bounded convex polyhedron $K^i$ that has the property mentioned in the
beginning of this section and $f_i(0) = 0$, then we come to the problem:

$$
\begin{aligned}
\min\ & \Big\{ c^T x + \sum_{i=1}^{N} f_i(\xi^i - T^i x) \Big\} \\
\text{subject to}\ & Ax = b, \\
& x \ge 0.
\end{aligned}
\tag{3}
$$

Using the $\lambda$-representation for the functions $f_1, \ldots, f_N$, problem (3) takes the form of
problem (2).

(v) Stochastic programming with recourse in case of a discrete random vector.

(vi)  Programming  under  probabilistic  constraint. 
The purpose of the paper is to present a dual type method for the solution of the problem,
together with a fast bounding technique. The method has been implemented in FORTRAN
77 and numerical results have been obtained on Sun SPARC workstations. The results,
especially those concerning the bounding technique, are very promising.
1  Introduction

Multi-attribute  decision  models  rely  on  two  forms  of  numeric  input:  objective 
data  defining  the  physical  aspects  of  the  alternatives,  states  and  consequences 
of  the  decision  model;  judgemental  data  relating  to  the  decision  maker’s  (DM) 
beliefs  and  preferences.  Although  variations  on  both  forms  of  data  may  have 
great  effects  on  the  outcome  of  the  decision  model,  the  concern  here  is  with 
variations  in  the  judgemental  data. 

In  its  traditional  sense,  sensitivity  analysis  allows  exploration  of  the  effects 
of  variations  in  the  judgemental  input  on  the  ranking  of  alternatives.  However, 
as  pointed  out  by  French  [5]  and  others,  it  also  plays  the  role  of  an  aid  to  the 
elicitation  process  of  decision  analysis  by  focussing  discussion  and  reflection  on 
the  judgemental  data.  Commercially  available  decision  analysis  packages  such 
as  ARBORIST  [12],  HIVIEW  [1]  and  VISA  [3],  however,  allow  only  limited 
forms  of  sensitivity  analysis. 

Recently Rios Insua and French [9, 11] have developed a conceptual framework
for sensitivity analysis in multi-criteria decision making which allows
simultaneous variation of judgemental data and which applies to any paradigm
for decision analysis. The principal problem in implementing the Rios Insua-
French framework is the heavy computational load which is required to support
it. This arises because the analysis relies on distance-based tools and involves
the solution of many mathematical programming problems, which may be
nonlinear and nonconvex. Even though the individual optimisation problems
are small, this represents a substantial load, particularly since, in order to
provide reliable information from the sensitivity analysis, global optima of the
nonconvex problems must be sought. The computational load may inhibit use
of the framework, particularly as decision analyses are often performed in the
context of decision conferences, where it is desirable for sensitivity analyses to
be conducted in near real time, preferably on a PC. Approaches to reducing
the time to perform the analysis include reformulation of some of the
mathematical programs involved [7, 10] and exploiting parallelism within the phases.

In the following we shall briefly describe the framework and its implementation
in a parallel environment consisting of a PC enhanced by the addition
of a transputer board. This will allow the necessary analyses to be undertaken
in parallel, thus making it possible to handle the computational load in
acceptable times. Experimental results with linear and bilinear models will be
included.


2  The  Sensitivity  Analysis  Algorithm 

2.1  The  Evaluation  Problem 

Consider the evaluation problem in multi-criteria decision making [2] in which
a finite set of alternatives $a_i$, $i = 1, 2, \ldots$, is ranked using the evaluation
function $\Psi(a_i, w)$, where $w$ is the vector of judgemental data. Then alternative $a_i$ is
preferred to alternative $a_j$ if

$$
\Psi(a_i, w) > \Psi(a_j, w).
$$


To detect $a^*$, the alternative which currently ranks first, each judgemental
input is provided with an initial value $w^0$. Other information related to the
judgemental data, such as monotonicity conditions on utilities or normalisation
conditions on probabilities or weights, is represented by constraints limiting
the values of $w$ acceptable to the decision maker (DM) to the set $S$, so that
$w \in S$. Three cases occur in practice:

•  linear models, in which $\Psi$ is linear and $S$ is defined by linear constraints,

•  bilinear models, in which $\Psi$ is bilinear and $S$ is defined by linear
constraints,

•  general models, in which $\Psi$ is nonlinear and $S$ not necessarily convex.
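For the linear case, the ranking step and its dependence on the judgemental data can be sketched as follows. The weighted-sum form of the evaluation function, the alternative names and all numbers are assumptions chosen only to illustrate how a change in w can alter which alternative ranks first.

```python
def evaluate(scores, w):
    """Linear evaluation function: weighted sum of an alternative's scores."""
    return sum(wk * sk for wk, sk in zip(w, scores))


def rank(alternatives, w):
    """Return alternative names ordered from most to least preferred under w."""
    return sorted(alternatives, key=lambda name: evaluate(alternatives[name], w),
                  reverse=True)


# Hypothetical alternatives scored on two criteria.
alternatives = {
    "a1": [0.9, 0.2],
    "a2": [0.4, 0.8],
    "a3": [0.5, 0.5],
}

w0 = [0.7, 0.3]            # initial judgemental input
print(rank(alternatives, w0))
print(rank(alternatives, [0.3, 0.7]))  # a different weight vector
```

The second call reverses the position of the first-ranked alternative, which is the kind of switch the sensitivity analysis is meant to detect and quantify.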

2.2  Algorithm 

The  general  algorithm  for  sensitivity  analysis  as  described  in  [9]  proceeds 
through  four  phases: 

1.  The dominance phase in which the set $A_1$ of non-dominated alternatives
is found;

2.  The potential optimality phase in which $A_2$, the set of non-dominated
alternatives which could be optimal for some $w \in S$, is found;

3.  The adjacent potential optimality phase in which $A_3$, the set of potentially
optimal alternatives which are contenders to $a^*$ for optimality if a smooth
change of $w$ away from $w^0$ occurs, is found;

4.  The distance analysis phase in which the minimum change in $w^0$ required
by each contender $a_j$ before optimality switches from $a^*$ to $a_j$ is computed
in both the $l_1$ and $l_\infty$ metrics. This information, together with that
provided by maximum distance problems, is then used to compute an
index of sensitivity of the optimum solution [6].

2.3  Computational  Aspects 

Each phase requires the solution of several mathematical programmes, whose
form depends on the phase and the nature of $\Psi(a_i, w)$ and of $S$. In the
linear model, all problems arising in Phases 1-3 are linear programmes. The
minimum distance problems are convex and can be transformed to linear
programmes. The maximum distance problems are non-convex but the $l_\infty$
problem can be solved by linear programming [7]. Hence only the maximum
$l_1$ distance problem causes difficulty.

In the bilinear model, the $l_\infty$ problem again can be solved as a set of linear
programmes but all other problems have a bilinear objective function and/or
bilinear constraints and, in general, are non-convex. In the general model all
problems are general nonlinear programmes and, in general, are non-convex.

3  Parallel  Implementation 

The general algorithm for sensitivity analysis is intrinsically sequential due
to the precedence relationships between the phases. Parallelism, however, is
present at the level of each phase. Given the potentially large number of
mathematical programmes to be solved, large grain parallelisation is most
appropriate, especially when the hardware to be used is a MIMD machine, such
as a transputer network, because of the high ratio of time spent in
communication. Also, transputers being fast processors with enough local memory, it
is attractive to make them operate as autonomously as possible.

In the present implementation, the processor farm approach is adopted.
Individual problems arising at the level of each phase are farmed out to
individual nodes by the master process which resides on the root node. The
worker process is replicated on each node of the network.

The  master  process  consists  of  three  threads  running  concurrently:  the 
producer  which  generates  the  problems;  the  sender  thread  which  transmits 
them  to  the  network;  the  consumer  thread  which  collects  the  results  from  the 
network  and  displays  them. 

The  worker  process  consists  of  a  single  thread  which  communicates  with 
the  master  and  solves  linear  and  nonlinear  programmes. 

3.1  Equipment 

The parallel processing environment consists of an Elonex 386S-200 PC
comprising a TMB04 motherboard with a T805 processor and 4 Mbyte RAM as
the root node; 4 T805 transputers with 4 Mbyte of RAM each; and 3L Parallel
Software consisting of a Fortran compiler and a set of extension functions for
message passing, a task harness and a configuration system. The topology
of the network is a ring with a connection between the free link of the root
transputer and one of the two nodes which are not linked to it. The network
is configured by a file submitted to the static configurer or the flood-fill
configurer. Synchronisation is either implicit through algorithm design or explicit
using descheduling functions.

In order to use this system, codes for linear and general nonlinear
programming are required. For linear programming, a simplex-based routine has been
developed. For the general nonlinear programming problem, the multi-level
single linkage algorithm of Rinnooy Kan and Timmer [8] was adopted. It relies
on a local optimisation routine based on a sequential quadratic programming
algorithm [4].

3.2  Parallel  Algorithm 

The  algorithm  of  the  master  process  consists  of  three  procedures  which  run 
concurrently. 

Producer thread:

    Read data of the considered model of decision analysis;
    Broadcast initial data to all workers;
    Start Sender;
    Start Consumer;
    Rank alternatives according to $\Psi(a_i, w)$ and $w^0$;
    Start sensitivity analysis;
    For Phase = 1:4 do
        a: Generate problem in Phase;
           Put under standard message form;
           Wait for Sender to be ready;
           Pass message to Sender;
           go to a: until all problems in Phase have been generated;
    Endfor;
    When all phases have been considered send Stop-message to
    Sender and Workers;
    Send Stop-message to Consumer through a COMMON variable;

Sender thread:

    b: Wait for an incoming message from Producer;
       Receive message;
       When the network presents an idle processor, pass message to it;
       If message is a Stop-message then stop;
       Otherwise go to b:

Consumer thread:

    c: Wait for an incoming message from the network;
       Receive message;
       Process it and display results;
       Check for Stop if current phase is the last;
       Or else go to c:
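
The producer/sender/consumer structure above can be mimicked on a single machine with threads and queues. The sketch below is only an analogy under stated assumptions: Python threads stand in for transputer nodes, `solve` is a hypothetical placeholder for the LP/NLP solvers, a shared queue plays the role of "the next idle processor", and the precedence between phases (later phases depending on earlier results) is not modelled.

```python
import queue
import threading

STOP = None  # sentinel playing the role of the Stop-message


def solve(problem):
    """Hypothetical stand-in for the solver run on a worker node."""
    phase, k = problem
    return phase, k, k * k


def worker(to_workers, results):
    while True:
        problem = to_workers.get()
        if problem is STOP:
            break
        results.put(solve(problem))


def sender(to_sender, to_workers, n_workers):
    while True:
        message = to_sender.get()
        if message is STOP:
            for _ in range(n_workers):   # forward one Stop-message per worker
                to_workers.put(STOP)
            break
        to_workers.put(message)          # taken by whichever worker is idle


def consumer(results, collected):
    while True:
        item = results.get()
        if item is STOP:
            break
        collected.append(item)


def run_farm(problems_per_phase, n_workers=4):
    to_sender = queue.Queue(maxsize=1)   # Producer waits until Sender is ready
    to_workers = queue.Queue()
    results = queue.Queue()
    collected = []
    workers = [threading.Thread(target=worker, args=(to_workers, results))
               for _ in range(n_workers)]
    send_t = threading.Thread(target=sender, args=(to_sender, to_workers, n_workers))
    cons_t = threading.Thread(target=consumer, args=(results, collected))
    for t in workers + [send_t, cons_t]:
        t.start()
    # Producer: generate the problems of each phase in turn.
    for phase, count in enumerate(problems_per_phase, start=1):
        for k in range(count):
            to_sender.put((phase, k))
    to_sender.put(STOP)
    send_t.join()
    for t in workers:
        t.join()
    results.put(STOP)                    # all workers finished: stop the Consumer
    cons_t.join()
    return collected
```

Calling `run_farm([3, 2, 1, 2])` returns eight result tuples, in an order that depends on thread scheduling.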


4  Conclusion 

Computational  results  show  that  the  framework  is  viable  for  linear  models  of 
practical  size.  For  the  bilinear  models,  limited  experience  shows  performance 
gains  due  to  parallelisation.  Two  strategies  for  solving  the  global  optimisation 
problems  in  parallel  were  tried;  the  first  strategy  kept  the  algorithm  for  global 
optimisation  as  part  of  the  master  process  and  thus  only  local  optimisation 
problems  were  farmed  out.  In  the  second  the  global  optimisation  problems 
themselves  were  sent  to  the  workers,  thus  making  the  global  optimisation 
routine part of the worker process. Although better results were obtained
with the second strategy, in both cases the speed-up was less than linear.
1  Introduction

The Global Optimisation (Minimisation) Problem can be expressed as follows:
Let $f$ be a function from $\mathbb{R}^n$ to $\mathbb{R}$ and $A \subset \mathbb{R}^n$; then find $x^* \in A$ such that
$\forall x \in A$, $f(x^*) \le f(x)$.

When $f$ is unimodal, a suitable algorithm may be found among the
numerous algorithms for local optimisation, depending on properties such as
continuity and differentiability of $f$, compactness of $A$ etc. When $f$ is multimodal
and/or lacking the above properties, there is no method which is guaranteed
to find its global minimum. A difficulty is that no practically useful test for
optimality is available, such as exists for smooth local optimisation, so that,
in principle, one has to evaluate the function at every point of $A$, which is not
possible. As Schoen [11] says, the problem is "inherently intractable".

Despite the difficulty of the problem, we are witnessing an 'explosion' in
the design of algorithms for GO. As a consequence it is not easy for the user
to choose a suitable algorithm for his/her application, particularly if the
algorithm is to be embedded in a general purpose package. Despite many reviews
of GO techniques, the experimental record is patchy but points to the relative
success of stochastic algorithms. However, there are many algorithms in this
class.

In the following we shall be concerned with representatives of two popular
stochastic algorithms for global optimisation, namely clustering and simulated
annealing. Issues related to their implementation and use to solve practical
problems involving general constraints in addition to simple bounds on the
variables will be discussed. Computational results will be given.

2  Stochastic  Algorithms 

Stochastic algorithms generally employ both heuristic and probabilistic
methods to provide an approximation to the global optimum. The major
components of such algorithms are:

•  sampling - generate a random sample of points in $A$

•  local optimisation - apply a local optimisation algorithm to $f$ starting
from each of a subset of the sampled points

•  stopping rules - decide whether to draw another sample or whether to
accept the smallest observed value of $f$ in $A$ as an approximation to the
global minimum.

The quality of the approximation usually can be traded off against the runtime
of the algorithm by adjusting one or more parameters which control the above
components. However, there is little or no systematic knowledge on which to
base such tradeoffs. In the case of constrained global optimisation, $A$ is often
relaxed to a hypercube in order to ease the task of sample generation. We
then rely on the local optimiser to find feasible points.
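
The three components (sampling, local optimisation, stopping rule) can be put together in a few lines. The following is a minimal one-dimensional sketch, not one of the algorithms compared in this work: the test function, the compass-search local optimiser, the "better half of the sample" selection and the "stop when a round brings no improvement" rule are all assumptions chosen for brevity.

```python
import random


def local_search(f, x, lo, hi, step=0.5, tol=1e-8):
    """Derivative-free compass search in one dimension, clipped to [lo, hi]."""
    fx = f(x)
    while step > tol:
        improved = False
        for cand in (x - step, x + step):
            cand = min(max(cand, lo), hi)
            fc = f(cand)
            if fc < fx:
                x, fx, improved = cand, fc, True
                break
        if not improved:
            step /= 2.0          # refine only when neither neighbour is better
    return x, fx


def multistart(f, lo, hi, sample_size=20, max_rounds=5, seed=0):
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(max_rounds):
        # Sampling: uniform random points in the box [lo, hi].
        sample = [rng.uniform(lo, hi) for _ in range(sample_size)]
        # Local optimisation from a subset: the better half of the sample.
        starts = sorted(sample, key=f)[: sample_size // 2]
        improved = False
        for x0 in starts:
            x, fx = local_search(f, x0, lo, hi)
            if fx < best_f - 1e-9:
                best_x, best_f, improved = x, fx, True
        # Stopping rule: accept the best value once a round brings nothing new.
        if not improved:
            break
    return best_x, best_f


def f(x):
    """Illustrative multimodal function with two local minima on [-3, 3]."""
    return x**4 - 3 * x**2 + x


best_x, best_f = multistart(f, -3.0, 3.0)
```

Raising `sample_size` or `max_rounds` buys reliability at the cost of runtime, which is the trade-off described above.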

2.1  Clustering 

Cluster analysis is usually associated with pattern recognition where objects
belonging to an initial set are separated into subsets or clusters according to
some similarity characteristic, such as shape or colour. In GO it was
introduced by Becker and Lago [2] in the early 70's in order to reduce computing
time in the multistart algorithm [15]. The basic idea is to try to identify
clusters of points which will lead to the same local minimum. By starting a
local optimisation from a representative of each cluster, each distinct local
minimum hopefully will be discovered only once. The practical implementation of
this idea, however, depends on the choice of many parameters which make up
the Threshold Distance on which clustering depends. Negotiating the difficult
choice of these parameters and adequate stopping rules remains the key to the
efficient implementation of clustering algorithms. Recent attempts have been
made to improve clustering algorithms by adequate choice of sample size [12]
and use of topographical information of the search space [13]. However, no
experimental results yet provide strong support for these improvements.

The  multi-level  single  linkage  algorithm  of  Rinnooy  Kan  and  Timmer  [10] 
appears  to  be  one  of  the  most  efficient  amongst  stochastic  algorithms  for  GO. 
Tests,  however,  appear  to  be  confined  to  problems  of  small  dimension  and 
with  only  simple  bound  constraints.  An  implementation  of  this  algorithm 
will  be  considered  here.  A  somewhat  simpler  clustering  method  due  to  Torn 
and  Viitanen  [14]  is  also  considered  since  it  has  potential  advantage  for  the 
application  which  motivated  this  research. 

2.2  Simulated  Annealing 

The  concepts  of  annealing  in  optimisation  were  introduced  in  the  early  80’s 
[9],  [5].  Initially  interest  was  in  their  application  to  solving  combinatorial 
optimisation  problems  [1].  The  simulated  annealing  based  algorithm  (SA)  has 
been  reported  [9],  [8]  to  perform  well  on  such  problems  in  high  dimensions 
with  a  large  number  of  local  minima.  Based  on  this  success,  variants  of  SA 
for  continuous  global  optimisation  have  been  developed  [16],  [4],  [6],  [7].  SA 
is  a  stochastic  method  by  means  of  which  the  global  minimum  of  a  function 
$f$, regardless of its continuity or differentiability, can be approached as closely
as one desires. The main feature of the algorithm is its ability to 'distinguish'
between  the  fine  behaviour  and  the  gross  behaviour  of  the  objective  function. 
Assuming  that  fine  behaviour  leads  to  poor  local  minima  with  small  regions 
of  attraction,  the  algorithm  detects  it  and  avoids  getting  trapped  by  taking 
uphill  steps,  thus  allowing  the  function  value  to  increase  momentarily.  This 
strategy  is  based  on  what  is  called  the  acceptance/rejection  rule.  To  draw  a 
parallel,  note  that  the  strategy  of  the  multistart  algorithm  relies  on  starting  a 
local  search  from  different  points. 

Accepting or rejecting uphill steps in SA is determined by a sequence of
random points with a controlled probability $\min\{1, e^{(f(x) - f(y))/T}\}$ [7], where $x$ is
the current point, $y$ is the candidate point and $T$ is
the control parameter. This parameter is crucial as it slows down the algorithm
if it is too high and it removes the global aspect of the algorithm, i.e. uphill
moves, if it is too small [1], [8], [11]. Besides parameter $T$, which requires an
initial value, a decrement function for decreasing it and a final value to use in
the stopping condition, another parameter is also required to be set for any
practical implementation of the algorithm. This parameter is the length $L$
of each Markov chain corresponding to each sequence of decreasing $T$. The
concept of finite Markov chains is used to derive a mathematical model of SA,
given that in SA the outcome of a trial depends only on the outcome of the
previous trial [1]. This set of parameters is usually referred to as a cooling
schedule [7], [1].
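
The effect of the control parameter on uphill moves can be seen directly from the acceptance rule. The sketch below assumes the standard Metropolis-type form of the rule; the function names are illustrative and the cooling schedule itself (initial value, decrement function, chain length) is deliberately left out.

```python
import math
import random


def acceptance_probability(f_current, f_candidate, T):
    """min{1, exp((f_current - f_candidate) / T)} for a control parameter T > 0."""
    if f_candidate <= f_current:
        return 1.0  # downhill or level moves are always accepted
    return math.exp((f_current - f_candidate) / T)


def accept(f_current, f_candidate, T, rng=random):
    """Draw the accept/reject decision for a single trial."""
    return rng.random() < acceptance_probability(f_current, f_candidate, T)


# An uphill step of size 1 at three values of the control parameter.
for T in (100.0, 1.0, 0.01):
    print(T, acceptance_probability(0.0, 1.0, T))
```

At a high value of T almost every uphill step is accepted, so the search wanders and converges slowly; at a very small value practically none is accepted and the method behaves like a local descent.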

The  simulated  annealing  algorithm  implemented  in  the  present  work  was 
based  on  the  Dekker  and  Aarts  variant  [7].  The  principal  difference  is  in  the 
generation  of  new  points  from  a  given  point.  While  in  [7]  this  was  accomplished 
using  a  local  search  procedure,  we  adopt  a  Hit-and-Run  algorithm  for  detecting 
non-redundant  constraints  [3].  This  is  mainly  because  the  search  spaces  of  the 
problems  to  be  solved  are  not  defined  just  by  simple  bound  constraints. 


3  Experimental  Work 

The  test  problems  used  in  the  experiments  are  practical  problems  arising  in 
decision  analysis.  They  are  by  no  means  extremely  difficult,  but  they  present 
nondifferentiability,  nonlinearity  in  the  objective  function  or  the  constraints  or 
both.  Above  all,  they  have  many  minima  and  nontrivial  constraints  and  are 
of  higher  dimension  than  is  often  encountered  in  the  literature.  Comparative 
results  obtained  with  Fortran77  codes  for  the  three  algorithms  cited  above  will 
be  reported. 
1  Introduction.

An important class of simple and efficient algorithms for optimizing a function $f$ on a set $S$ is the
class of greedy (or myopic) algorithms. Since the work of Edmonds [14], [15] on matroids and of
Hoffman [23] on transportation problems, numerous authors have studied conditions on $f$ and $S$
which guarantee the convergence of greedy algorithms to optimal solutions. In the case where $f$
is linear and $S$ is a polyhedron, two broad and well-known classes of linear programs have been
shown to be optimally solvable by a greedy algorithm. Following the work of Edmonds, the first
class includes optimization problems on polymatroids and related submodular polyhedra, see
Frank and Tardos [17] and Fujishige's monograph [19] for in-depth studies. On the other hand,
following the work of Hoffman, the second class includes transportation problems, both ordinary
and multi-index, with cost coefficients satisfying some form of a so-called Monge condition (see
Hoffman [25] and Bein et al. [11] for details).

In  this  paper,  we  present  a  dual  pair  of  linear  programs,  in  which  the  variables  are  associated 
with  the  elements  in  a  sublattice  of  a  discrete  product  lattice.  We  show  that  a  greedy  algorithm 
solves  both  primal  and  dual  programs  when  the  cost  coefficients  in  the  primal  problem  (or, 
equivalently, the right hand sides in the dual problem) are given by a submodular function on
the sublattice. The primal problem generalizes the multi-index transportation problem of Bein
et al. [11] to the case of forbidden arcs with a sublattice structure. The dual problem generalizes
the linear optimization problems on submodular polyhedra by Lovász [29] and Fujishige and
Tomizawa [20] to a distributive sublattice of a finite product space.

In  addition  to  enlarging  the  class  of  linear  programs  solvable  by  a  greedy  algorithm,  our 
work  also  unifies  heretofore  separate  streams  of  research  and  highlights  the  duality  relationship 
between  (multi-index)  transportation  problems  and  linear  optimization  problems  on  submodular 
polyhedra.  In  particular,  we  observe  that  submodularity  and  the  Monge  condition  are  the  same 
concept  expressed  in  different  forms.  Indeed,  known  results  on  lattices  and  submodular  functions 
are independently rediscovered in the context of the Monge condition for multi-dimensional
arrays.  Conversely,  new  results  for  Monge  matrices  also  apply  to  submodular  functions. 

The  contents  of  this  paper  are  as  follows.  In  Section  2,  we  define  the  dual  pair  of  linear 
programs which is the object of this paper. We show how they generalize multi-index
transportation problems and linear optimization problems on submodular polyhedra. In Section 3,
we  present  the  greedy  algorithm,  and  prove  that  it  produces  optimal  solutions  to  the  primal  and 
the  dual  problems.  Finally,  in  Section  4,  we  discuss  relations  between  submodularity,  the  Monge 
condition  for  a  matrix  and  the  existence  of  Monge  sequences. 

2  Lattices,  submodular  functions,  and  a  dual  pair  of  linear 
programs. 

Let the integer $k \ge 2$ denote the dimension of the product lattice defined below, and $K =
\{1, \ldots, k\}$. For $i \in K$, let $A_i$ be a totally ordered set (or chain) with $m(i) + 1$ elements. For
simplicity, we let $A_i = \{0, 1, \ldots, m(i)\}$, with the usual total order $0 < 1 < \cdots < m(i)$, for all
$i \in K$. The product space $A = A_1 \times A_2 \times \cdots \times A_k$ is a distributive lattice with join and meet
operations defined by

$$
a \vee b = (\max\{a(1), b(1)\}, \ldots, \max\{a(k), b(k)\})
$$

and

$$
a \wedge b = (\min\{a(1), b(1)\}, \ldots, \min\{a(k), b(k)\}),
$$

respectively, where $a$ and $b$ are any elements of $A$. As is well-known in lattice theory (see
references below), these operations induce the usual partial order "$\le$" on this lattice $A$ by

$$
a \le b \iff a = a \wedge b \ (\iff b = a \vee b).
$$

Let $0 = (0, \ldots, 0)$ and $m = (m(1), \ldots, m(k))$ denote the smallest and largest element of $A$,
respectively.

Let $B$ denote any subset of $A$. For any $i \in K$ and $j \in A_i$ we define the section $B(i,j)$ of $B$
at $(i,j)$ as $B(i,j) = \{a \in B : a(i) = j\}$. For any $a \in A$ we define the (lower) truncation $B_a$
of $B$ at $a$ as $B_a = \{b \in B : b \le a\}$.
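
Sections and truncations are simple filters on a set of integer tuples. The sketch below is illustrative only: elements are Python tuples, the coordinate index `i` is 0-based in the code (whereas the text counts coordinates from 1), and the small example lattice is an assumption.

```python
from itertools import product


def section(B, i, j):
    """Section B(i, j): elements of B whose i-th coordinate equals j (i is 0-based)."""
    return {a for a in B if a[i] == j}


def truncation(B, a):
    """Lower truncation B_a: elements b of B with b <= a in every coordinate."""
    return {b for b in B if all(bc <= ac for bc, ac in zip(b, a))}


# Example: the full product lattice with m = (1, 2), i.e. {0, 1} x {0, 1, 2}.
A = set(product(range(2), range(3)))

print(sorted(section(A, 0, 1)))          # first coordinate fixed at 1
print(sorted(truncation(A, (1, 1))))     # everything below (1, 1)
```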

A subset $B$ of $A$ is a sublattice if for every $a, b \in B$ we have $a \vee b \in B$ and $a \wedge b \in B$. If $B$ is
a sublattice then the sections $B(i,j)$ and the truncations $B_a$ are also sublattices, for all $i \in K$,
$j \in A_i$ and $a \in A$.

A real-valued function $f : B \to \mathbb{R}$ on a sublattice $B$ is submodular if the following submodular
inequality

$$
f(a \vee b) + f(a \wedge b) \le f(a) + f(b)
$$

holds for all $a, b \in B$. It is strictly submodular if this inequality is strict whenever $a \vee b \notin
\{a, b\}$. See, for example, Birkhoff [12] for a general exposition of lattice theory, and Topkis [37],
Veinott [38] and Granot and Veinott [21], and the references therein, about product lattices, their
sublattices, and submodular functions (called "subadditive" functions in the latter reference).
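
For small examples, both definitions can be checked by brute force over all pairs. The following sketch represents lattice elements as integer tuples; the function names and the two test functions are assumptions used only to show one submodular and one non-submodular case.

```python
from itertools import product


def join(a, b):
    """Componentwise maximum."""
    return tuple(max(x, y) for x, y in zip(a, b))


def meet(a, b):
    """Componentwise minimum."""
    return tuple(min(x, y) for x, y in zip(a, b))


def is_sublattice(B):
    """True if B is closed under join and meet."""
    S = set(B)
    return all(join(a, b) in S and meet(a, b) in S for a in S for b in S)


def is_submodular(f, B):
    """True if f(a v b) + f(a ^ b) <= f(a) + f(b) for all a, b in B."""
    return all(f(join(a, b)) + f(meet(a, b)) <= f(a) + f(b) for a in B for b in B)


# Product lattice with m = (2, 3).
A = list(product(range(3), range(4)))

print(is_sublattice([(0, 0), (1, 0), (1, 1)]))    # a chain is a sublattice
print(is_sublattice([(1, 0), (0, 1)]))            # join (1, 1) is missing
print(is_submodular(lambda a: -a[0] * a[1], A))   # negated product
print(is_submodular(lambda a: a[0] * a[1], A))    # the product itself
```

The check is quadratic in the size of B, so it is meant for experimenting with small lattices rather than for use inside an algorithm.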

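Let $B$ be any subset of $A$ and let $B^* = B \setminus \{0\}$. We associate a cost $w(a)$ with every
element $a \in B^*$, and a nonnegative demand $d_{ij}$ with every section $B(i,j)$ where $i \in K$ and
$j \in A_i^* = A_i \setminus \{0\}$.

We now formulate a dual pair of linear programs (P) and (D):

$$
\begin{aligned}
\text{(P)} \quad \min\ & \sum_{a \in B^*} w(a)\, x_a \\
\text{s.t.}\ & \sum_{a \in B(i,j)} x_a = d_{ij} \quad \text{for } i \in K \text{ and } j \in A_i^*; \\
& x_a \ge 0 \quad \text{for all } a \in B^*;
\end{aligned}
$$

and

$$
\begin{aligned}
\text{(D)} \quad \max\ & \sum_{i \in K} \sum_{j \in A_i^*} d_{ij}\, y_{ij} \\
\text{s.t.}\ & \sum_{i \in K} y_{i,a(i)} \le w(a) \quad \text{for all } a \in B^*.
\end{aligned}
$$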

Multi-index transportation problems. The primal problem (P) just defined contains as
a special case the following axial $k$-index transportation problem with forbidden arcs. The $k$ sets
$A_1^*, \ldots, A_k^*$ may be interpreted as sets of sources, destinations, types of goods, and various related
resources. Let $B$ denote the subset of $A' = A_1^* \times \cdots \times A_k^*$ consisting of non-forbidden (permissible)
combinations $a \in A'$. With each section $B(i,j)$ we associate a nonnegative "demand" $d_{ij}$ (which
may be interpreted as a supply when $A_i^*$ is a set of sources, and as a capacity when $A_i^*$ is a set
of resources). It is assumed that $\sum_{j \in A_i^*} d_{ij} = D$, a constant for all $i \in K$. With each element
$a \in B$, often called an "arc", we associate a cost rate $w(a)$ and a nonnegative decision variable
$x_a$ representing the amount of "demand" which is satisfied by the corresponding combination.
The (axial) $k$-index transportation problem is to determine the amount $x_a$ associated with each
permissible arc $a \in B$ so as to satisfy exactly the demand of each section $B(i,j)$ (for all $i \in K$
and $j \in A_i^*$) at minimum total cost. This problem may be formulated precisely as an instance
of problem (P).

The case where $B = A'$ (no forbidden arcs) is the axial multi-index transportation problem
defined by Haley [22] (see also Chapter 8 in [41], and [11]). The axial $k$-index assignment
problem arises when all $m(i)$ are equal to a constant $m$, all demands are equal to 1, and all
decision variables are restricted to be either 0 or 1, see, e.g., Pierskalla [32] and Bandelt et
al. [9]. When, in addition, $k = 3$, we have the much studied (axial) three-dimensional assignment
problem, see Frieze and Yadegar [18], Balas and Saltzman [8], and Crama and Spieksma [13].
The above references describe several practical applications of these different models in such
areas as logistics, automated production, statistics, and course scheduling.

Note that, in addition to offering a compact notation (compared to the above references), our
formulation of problem (P) allows us quite naturally to exclude forbidden arcs. Ordinary (i.e., 2-
index) transportation problems with forbidden arcs were considered by Shamir and Dietrich [36]
in connection with the existence of Monge sequences; see Section 4 for details. Note also that
we do not need to require in problem (P) that the total demand $\sum_{j \in A_i^*} d_{ij}$ be constant for all
$i \in K$.

Submodular polyhedra. When $m(i) = 1$ for all $i \in K$, the lattice $A$ may be identified
with the lattice $2^K$ of all subsets of $K$. Sublattices $B$ are then (distributive) set lattices, and
submodular functions coincide with those now well-known in combinatorial optimization (see,
e.g., Nemhauser and Wolsey [30]). The constraints of problem (D) are then precisely those
which define a submodular polyhedron as in Fujishige [19] (see also Frank and Tardos [17]).
When $B = A = 2^K$ we have the problem which Lovász [29] shows to be solvable by a greedy
algorithm. These polyhedra are closely related to the polymatroids introduced by Edmonds [15];
see the preceding references for details.

Problem (D) properly generalizes these submodular polyhedra by allowing each chain $A_i$ in
the lattice to contain any number of elements, giving rise to arbitrary (finite) product lattices.
This is akin to extending attention, in integer programming, from binary variables to general
integer-valued variables. We refer to the work of Topkis, Veinott, and Granot and Veinott cited
above for a description of some of the problems amenable to this broader framework.

The Greedy Algorithm in the next section generalizes, on the one hand, those of Hoffman and
Bein et al. for (ordinary and multi-index) transportation problems and, on the other hand,
those of Lovász and of Fujishige and Tomizawa for submodular polyhedra (the latter two being
themselves generalizations of that of Edmonds for polymatroids).


3  A  greedy  algorithm. 

We first describe the input and output of our algorithm for problems (P) and (D). Throughout
this section, we assume that B is any sublattice of A.

Input:

integer k                      the dimension of A;
integers m(1), ..., m(k)       defining the range of each coordinate of A;
reals $d_{ij} \ge 0$           demands, for all $i \in K$, $j \in A_i^+$;
oracle MAXLE                   describing sublattice B (see explanations below);
oracle w                       returning the value w(b) for any $b \in B$.


Output:

variable Status indicating the status, Feasible or Infeasible, of problem (P);

and if Status = Feasible:

list $((b^1, x_{b^1}), \ldots, (b^T, x_{b^T}))$ describing a primal solution (see below);

reals $y_{ij} \ge 0$ describing a dual solution, for all $i \in K$ and $j \in A_i^+$.

The sublattice B might be presented in different ways, such as: a list of all elements in B (the
permissible elements or cells); a list of all elements in A \ B (the forbidden elements); collections
of conditions (for example, monotone linear inequalities, see [38]) characterizing permissible or
forbidden elements; and so forth. However, to achieve sufficient generality and to exploit the
intrinsic simplicity of the Greedy Algorithm, we use the following oracle, which we call MAXLE.
The input to MAXLE consists of any element $a \in A$. Oracle MAXLE then returns "NULL"
if the truncation $B_a$ of B generated by a is empty; else it returns $\vee B_a$, the largest element
$b \in B$ such that $b \le a$. We leave it to the interested reader to determine which data structures
can be used to implement this oracle efficiently for a given representation of the sublattice B,
such as one of those outlined above.
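
For the simplest representation mentioned above, an explicit list of the elements of B, the oracle can be sketched as follows. This is only an illustration in Python, not an efficient implementation. The names `make_maxle` and `maxle` are hypothetical, lattice elements are assumed to be integer tuples ordered componentwise, and the list is assumed to be closed under componentwise maximum, so that the join of the truncation again lies in B.

```python
def make_maxle(elements):
    """Build a MAXLE oracle from an explicit list of the elements of B."""
    elements = [tuple(b) for b in elements]

    def maxle(a):
        # Truncation B_a: all elements of B lying componentwise below a.
        below = [b for b in elements if all(x <= y for x, y in zip(b, a))]
        if not below:
            return None  # plays the role of "NULL"
        # Join of B_a: componentwise maximum (in B because B is a sublattice).
        return tuple(max(column) for column in zip(*below))

    return maxle


# A 2 x 2 array in which cell (1, 2) is forbidden; (0, 0) is the bottom element.
maxle = make_maxle([(0, 0), (1, 1), (2, 1), (2, 2)])
print(maxle((2, 2)))  # (2, 2)
print(maxle((1, 2)))  # (1, 1)
```

Each call scans the whole list, so this sketch costs time proportional to the size of B per query; a representation by inequalities would need a different implementation.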

The output of the Greedy Algorithm exploits the sparseness of the basic solutions to problem
(P). Although the primal solution vector x has one component $x_b$ for every element $b \in B^+$
(and hence potentially up to $\prod_{i \in K} (m(i) + 1) - 1$ variables), at most $\sum_{i \in K} m(i)$ of these
will assume a positive value. Thus the Greedy Algorithm, which will be shown to produce a dual
pair of basic solutions to problems (P) and (D), returns a list of T pairs $(b, x_b)$, where $b \in B^+$
and $x_b$ is the value of the corresponding variable in the solution. The number T of such pairs is
determined by the algorithm, but will be shown not to exceed $\sum_{i \in K} m(i)$. It is understood
that $x_b = 0$ for all $b \in B^+$ which do not appear in this list.

The Greedy Algorithm detailed below consists of two phases. The Primal Phase repeats the
following step: identify (using the MAXLE oracle) the largest available element $b \in B^+$ and
assign the largest possible value to its variable $x_b$. This step is repeated until either infeasibility
is detected, in which case the algorithm halts, or all demands are satisfied. In the latter case, the
list of $(b, x_b)$ pairs output by the algorithm defines a feasible primal solution. The Dual Phase
then traces back the sequence of elements $b \in B$ recorded in the Primal Phase to construct
(using the w oracle) a dual solution y.

GREEDY ALGORITHM:

Primal Phase.

0. (Initialize:)
   Let $\delta_{ij} := d_{ij}$ for all i, j;
   $\Lambda := K$; a := MAXLE(m); t := 0;

1. (Main step:)
   Repeat
      if ($a \ne$ NULL) then {
         if (there exists (i, j) with j > a(i) and $\delta_{ij} > 0$) then
            return(Status := Infeasible)
         else {
            t := t + 1;
            let $i = \mathrm{argmin}\{\delta_{\ell, a(\ell)} : \ell \in \Lambda\}$;
            $x_a := \delta_{i, a(i)}$; $\eta(t) := i$;
            add $(a^t, x_{a^t}) := (a, x_a)$ to the output list;
            let $\delta_{\ell, a(\ell)} := \delta_{\ell, a(\ell)} - x_a$ for all $\ell \in \Lambda$;
            a(i) := a(i) - 1; if a(i) = 0 then let $\Lambda := \Lambda \setminus \{i\}$;
            let a := MAXLE(a);
         }
      }
   until (a = 0 or a = NULL);

2. (Final test for infeasibility:)
   If (a = NULL and there exists (i, j) with $\delta_{ij} > 0$) then
      return(Status := Infeasible);
   else { let T := t; output the list $(a^1, x_{a^1}), \ldots, (a^T, x_{a^T})$; }

Dual Phase.

For t := T down to 1 do
   output $y_{\eta(t), a^t(\eta(t))} := w(a^t) - \sum \{ y_{\eta(u), a^u(\eta(u))} : \text{all } u > t \text{ with } a^u(\eta(u)) = a^t(\eta(u)) \}$;

Return(Status = Feasible).


Note that, when applied to an ordinary (two-dimensional) transportation problem, the Primal
Phase reduces to the well-known North-West corner rule (with an appropriate geographic
orientation of the transportation array). More generally, the Primal Phase reduces to the greedy
algorithm of Bein et al. [11] for multi-index transportation problems without forbidden arcs. In
the case where problem (D) is a linear optimization problem over a submodular polyhedron
(that is, $A_i = \{0, 1\}$ for all $i \in K$), the Primal Phase amounts to sorting the $d_{i1}$ values in a
nondecreasing order, consistent with the sublattice B in the following sense: if b(i) = 1 for all
$b \in B$ with b(h) = 1, and if $\eta(t) = i$ and $\eta(u) = h$, then t > u (for all $h, i \in K$). Then, in
the Dual Phase, the y-variables are sequentially maximized according to this sequence, with
$y_{\eta(t), a^t(\eta(t))} := w(a^t) - \sum_{u > t} y_{\eta(u), a^u(\eta(u))}$. Thus the Greedy Algorithm just presented
reduces to that of Lovász [29] and of Fujishige and Tomizawa [20] in the case of submodular polyhedra.
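
The two-dimensional special case can be illustrated with the classical North-West corner rule. The sketch below is written in Python under the usual textbook conventions (supplies for rows, demands for columns, equal totals) and is not taken from the paper; the function name `north_west_corner` is hypothetical. It starts in the top-left cell, whereas the Primal Phase starts from the largest lattice element, which is the same rule on a mirrored array.

```python
def north_west_corner(supply, demand):
    """Return the positive shipments (row, column, quantity) of the NW corner rule."""
    if sum(supply) != sum(demand):
        raise ValueError("total supply must equal total demand")
    s, d = list(supply), list(demand)
    i = j = 0
    plan = []
    while i < len(s) and j < len(d):
        q = min(s[i], d[j])          # ship as much as possible in the current cell
        if q > 0:
            plan.append((i, j, q))
        s[i] -= q
        d[j] -= q
        if s[i] == 0:                # row exhausted: move down
            i += 1
        else:                        # column exhausted: move right
            j += 1
    return plan


print(north_west_corner([3, 2], [2, 3]))  # [(0, 0, 2), (0, 1, 1), (1, 1, 2)]
```

The number of positive shipments never exceeds the number of rows plus the number of columns minus one, in line with the sparseness of the basic solutions discussed earlier.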

Theorem 3.1 Let B be a sublattice of a finite product space.

(1) The Greedy Algorithm returns Status = Infeasible if and only if problem (P) is infeasible.

(2) If problem (P) is feasible, then the Greedy Algorithm outputs an optimal solution to
problem (P) for all nonnegative demands d if and only if w is submodular.

(3) If problem (P) is feasible and w is submodular, then the Greedy Algorithm outputs an
optimal solution to problem (D) in the Dual Phase.

Note that, for a given demand vector d such that problem (P) is feasible, the primal solution
constructed by the Greedy Algorithm does not depend on the cost function w, provided it
is submodular. See [1] and [2] for a study of a similar property in the context of ordinary
transportation and minimum cost network flow problems.

A consequence of Theorem 3.1 is that, when w is submodular, the inequalities of problem (D)
form a totally dual integral (TDI) linear inequality system (see Edmonds and Giles [16], Hoffman
[24], and Nemhauser and Wolsey [30] for a definition and properties of TDI systems).

4  Submodular  costs  and  Monge  properties. 

The main purpose of this section is to point out and exploit the equivalence between
submodularity of a function defined on a product of k chains, and the Monge condition of a
k-dimensional array. We also introduce the concept of submodular sequences, and discuss its
relationship with that of Monge sequences in two-dimensional arrays. In particular, we show that,
for any strictly submodular two-dimensional array, the class of submodular sequences coincides
with that of Monge sequences.

The concept of Monge sequences was introduced for two-dimensional arrays (matrices) by
Hoffman in 1963 [23] in order to describe classes of transportation problems that are greedily
solvable. A Monge sequence for a two-dimensional n × m array C = (c[i,j]) is a total ordering
of the nm pairs (i,j) such that, whenever pair (i,j) precedes both pairs (i,t) and (k,j),

$c[i,j] + c[k,t] \le c[i,t] + c[k,j].$

The inequality in this definition has been independently observed and exploited in algorithms
for a wide variety of problems, under various conditions about the indices i, j, k and t. The
most common condition is that i < k and j < t: a two-dimensional array C satisfies the Monge
condition if this inequality holds for all i, j, k, t with i < k and j < t. Note that this definition
is precisely that of the submodularity of the function defined by C on the product lattice
{1, ..., n} × {1, ..., m}.
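
The Monge condition can be tested directly from its definition. The following Python sketch is an illustration only (the function name `is_monge` and the two example arrays are assumptions, not from the paper); it checks all index quadruples with i < k and j < t, which takes O(n²m²) time.

```python
from itertools import combinations


def is_monge(c):
    """True if c[i][j] + c[k][t] <= c[i][t] + c[k][j] for all i < k and j < t."""
    n, m = len(c), len(c[0])
    return all(
        c[i][j] + c[k][t] <= c[i][t] + c[k][j]
        for i, k in combinations(range(n), 2)
        for j, t in combinations(range(m), 2)
    )


# c[i][j] = -i*j satisfies the condition; c[i][j] = i*j violates it.
monge = [[-(i * j) for j in range(4)] for i in range(3)]
not_monge = [[i * j for j in range(4)] for i in range(3)]
print(is_monge(monge), is_monge(not_monge))  # True False
```

For the first array the difference between the two sides of the inequality is (k - i)(t - j), which is positive whenever i < k and j < t.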

When  the  only  defined  entries  in  the  matrix  are  above  the  diagonal  (thus  forming  a  sublattice 
of the lattice A of matrix cells), this condition is also called the (concave) quadrangle inequality
(e.g., [40]), and (as if to add to the confusion) functions defined by such an array are sometimes
called concave functions (e.g., [28]). The computer science community has seen a flurry of
activity  on  these  and  closely  related  concepts  during  the  past  few  years.  This  activity  was 
motivated  in  part  by  the  seminal  paper  of  Aggarwal  et  al.  [4]  on  matrix  searching,  and  also  by  a 
wide  variety  of  applications  to  problems  in  such  areas  as  computational  geometry  (e.g.,  [4],  [3]), 
VLSI  channel  routing  (e.g.,  [4],  [5]),  signal  quantization  [39],  molecular  biology  (e.g.,  [35],  [28]), 
dynamic lot sizing (the Wagner-Whitin problem) [7], and the travelling salesman problem [31].
Because  the  field  is  now  so  vast,  we  have  only  given  here  a  few  indicative  references.  Further 
references  can  be  found  therein. 

The concept of Monge arrays has recently been extended to k-dimensional arrays by Aggarwal
and Park [5], [6]: a k-dimensional array C satisfies the Monge condition if every two-dimensional
plane of C corresponding to fixed values of k − 2 coordinates satisfies the Monge condition.

Actually, the Monge condition just defined is equivalent to the submodularity of the function
$c : A \to \mathbb{R}$ defined by the array C. Indeed, the following result (compare Proposition 2.4 in [5])
is a direct rephrasing of Theorems 3.1 and 3.2 in Topkis [37] for the case where A is a product
of a finite number of chains, as is assumed throughout the present paper:

Theorem 4.1 A function $c : A \to \mathbb{R}$ is submodular if and only if it is submodular on every
two-dimensional sublattice (plane) corresponding to fixed values of k − 2 coordinates.
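
The equivalence in Theorem 4.1 can be observed numerically on small examples. The Python sketch below is an illustration under stated assumptions: the lattice is taken to be a product of chains {0, ..., s − 1}, the function c is passed as a callable on integer tuples, and both function names are hypothetical. The first routine applies the lattice definition of submodularity to all pairs of points; the second checks the two-dimensional Monge inequality on every coordinate plane.

```python
from itertools import combinations, product


def submodular_on_lattice(c, shape):
    """Direct definition: c(meet) + c(join) <= c(a) + c(b) for all a, b."""
    points = list(product(*(range(s) for s in shape)))
    for a, b in combinations(points, 2):
        meet = tuple(map(min, a, b))
        join = tuple(map(max, a, b))
        if c(meet) + c(join) > c(a) + c(b):
            return False
    return True


def submodular_on_planes(c, shape):
    """Check the Monge inequality on every plane with k - 2 fixed coordinates."""
    dims = range(len(shape))
    for p, q in combinations(dims, 2):
        rest = [range(shape[d]) if d not in (p, q) else (0,) for d in dims]
        for base in product(*rest):
            def at(x, y):
                point = list(base)
                point[p], point[q] = x, y
                return c(tuple(point))
            for i, k in combinations(range(shape[p]), 2):
                for j, t in combinations(range(shape[q]), 2):
                    if at(i, j) + at(k, t) > at(i, t) + at(k, j):
                        return False
    return True


shape = (2, 3, 2)
concave_of_sum = lambda a: -(sum(a) ** 2)   # submodular
convex_of_sum = lambda a: sum(a) ** 2       # not submodular
print(submodular_on_lattice(concave_of_sum, shape),
      submodular_on_planes(concave_of_sum, shape))   # True True
print(submodular_on_lattice(convex_of_sum, shape),
      submodular_on_planes(convex_of_sum, shape))    # False False
```

The plane-wise test examines far fewer pairs than the direct definition, which is the practical content of the theorem.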

As a consequence, many results on k-dimensional arrays satisfying the Monge condition (such
as in Section 2 of [5] and in Section 2 in [11]) can be directly derived from known results on
submodular functions. In addition, the deep theory of parametric lattice programming developed
in Topkis [37] also applies to problems involving k-dimensional arrays satisfying the Monge
condition. Conversely, the rich computer science literature on Monge and related arrays, in
particular the fast algorithms developed for a variety of such problems, may also be exploited
to study submodular functions on discrete product lattices. (A case in point is the dynamic
lot-sizing problem considered in detail in Topkis, and for which fast algorithms were derived
in [7] using the Monge condition.) The problem of recognizing the Monge condition, and hence
submodularity, is considered by Ruediger [74] for k-dimensional arrays.

We  now  go  back  to  Monge  sequences.  This  concept  was  originally  introduced  by  Hoffman 
for  two-dimensional  arrays,  as  seen  above.  It  was  further  investigated,  among  others,  in  [36], 
[1], [2], and is closely related to the notion of greedoids (see [27]).

Some  authors  (e.g.,  Bein  et  al.  [10],  and  Ruediger  [33])  have  recently  investigated  the  possi¬ 
bility  of  extending  this  concept  to  higher  dimensions.  However,  at  present  no  approach  seems  to 
clearly  dominate.  Hence  we  restrict  our  discussion  of  relations  between  submodularity  and  the 
existence  of  Monge  sequences  to  the  two-dimensional  case.  The  following  notion  arises  naturally 
in the lattice framework: a submodular sequence for a two-dimensional n × m array C = (c[i,j])
is a total ordering of the nm pairs (i,j) such that, for any i, j, k and t, at least one of the pairs
(i,j) ∧ (k,t) or (i,j) ∨ (k,t) precedes both pairs (i,j) and (k,t).

Proposition  4.1  A  two-dimensional  array  is  submodular  (or,  equivalently,  satisfies  the  Monge 
condition)  if  and  only  if  every  submodular  sequence  is  a  Monge  sequence. 
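
Both kinds of sequence can be checked mechanically on small arrays. The Python sketch below is illustrative only; the function names are hypothetical, cells are 0-based (row, column) pairs, and the submodular-sequence test is applied to incomparable pairs of cells only, which is an assumption about how the definition is meant to be read (for comparable pairs the meet and join coincide with the pairs themselves).

```python
from itertools import permutations, product


def is_monge_sequence(c, order):
    """Whenever (i,j) precedes (i,t) and (k,j): c[i][j] + c[k][t] <= c[i][t] + c[k][j]."""
    pos = {cell: rank for rank, cell in enumerate(order)}
    for (i, j), (k, t) in product(pos, repeat=2):
        if i == k or j == t:
            continue  # the inequality is trivial in these cases
        if pos[(i, j)] < pos[(i, t)] and pos[(i, j)] < pos[(k, j)]:
            if c[i][j] + c[k][t] > c[i][t] + c[k][j]:
                return False
    return True


def is_submodular_sequence(order):
    """For incomparable cells, the meet or the join must precede both of them."""
    pos = {cell: rank for rank, cell in enumerate(order)}
    for (i, j), (k, t) in product(pos, repeat=2):
        if not (i < k and j > t):
            continue  # visit each incomparable pair exactly once
        meet, join = (i, t), (k, j)
        first = min(pos[(i, j)], pos[(k, t)])
        if pos[meet] >= first and pos[join] >= first:
            return False
    return True


cells = list(product(range(2), repeat=2))
orders = list(permutations(cells))
strict = [[0, 1], [1, 0]]      # strictly submodular 2 x 2 array
constant = [[5, 5], [5, 5]]    # submodular, but not strictly
print(all(is_monge_sequence(strict, o) == is_submodular_sequence(o) for o in orders))
print(all(is_monge_sequence(constant, o) for o in orders))
```

On the 2 x 2 strictly submodular example the two notions agree for all 24 orderings, while for the constant array every ordering passes the Monge test although only some orderings are submodular sequences.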

It  is  not  true  that,  in  a  submodular  array,  every  Monge  sequence  is  a  submodular  sequence. 
For example, consider a constant array C where all entries c[i,j] have the same value. Then
every sequence of the pairs (i,j) is a Monge sequence. However, the next result shows that
Monge  sequences  coincide  with  submodular  sequences  when  C  satisfies  a  strict  version  of  the 
Monge condition:

Proposition 4.2 If a two-dimensional array C defines a strictly submodular function then a
total ordering of the pairs (i,j) is a submodular sequence if and only if it is a Monge sequence.

Let G = (N, E) be an edge-weighted undirected graph with node set N = {1, ..., n}, edge
set E and weights $w_{ij}$, $ij \in E$. We consider the problem of partitioning the node set N into
k disjoint subsets $S_1, \ldots, S_k$ of specified sizes $m_1 \ge m_2 \ge \ldots \ge m_k$, $\sum_j m_j = n$, so as to
minimize the total weight of the edges connecting nodes in distinct subsets of the partition. This
problem is well known to be NP-hard and therefore finding an optimal solution is likely to be a
difficult task. We focus on bounding the optimal partitioning. A survey on related problems can
be found in e.g. [4].

An instance of a graph partitioning problem is described by the (symmetric) adjacency matrix
A of size n and an integer vector $m = (m_1, \ldots, m_k)$, defining the specified sizes for the subsets
of the partition. The vector u is a vector consisting of ones. Finally we denote by w(E) the
sum of all edge weights of G, i.e. $w(E) = u^T A u / 2$, and by $w(E_{cut})$ the total weight of the edges
cut by an optimal partition. Moreover let $w(E_{uncut}) = w(E) - w(E_{cut})$. The following nonlinear
optimization problem solves the graph partitioning problem, see e.g. [5].

(GP) $w(E_{uncut}) = \max \tfrac{1}{2} \operatorname{tr} X^T A X$

such that

$X^T X = \operatorname{diag}(m)$ (1)

$X u_k = u_n; \quad X^T u_n = m$ (2)

$X \ge 0.$ (3)

The constraints guarantee that all entries of the n × k matrix X are either 0 or 1. The nonzero
entries of column j of X represent the nodes contained in $S_j$.
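
The correspondence between partitions and feasible matrices X in (GP) can be made concrete on a toy instance. The Python sketch below uses plain lists and hypothetical helper names (`partition_matrix`, `is_feasible`, `uncut_weight`); it builds X from a given partition, verifies constraints (1) to (3), and evaluates the objective.

```python
def partition_matrix(parts, n):
    """n x k 0/1 matrix X with X[i][j] = 1 iff node i belongs to subset j."""
    return [[1 if i in part else 0 for part in parts] for i in range(n)]


def is_feasible(X, sizes):
    """Check constraints (1), (2) and (3) of (GP) for the size vector m = sizes."""
    n, k = len(X), len(X[0])
    gram_ok = all(                                            # (1): X^T X = diag(m)
        sum(X[i][p] * X[i][q] for i in range(n)) == (sizes[p] if p == q else 0)
        for p in range(k) for q in range(k))
    rows_ok = all(sum(row) == 1 for row in X)                 # (2): X u_k = u_n
    cols_ok = all(sum(X[i][j] for i in range(n)) == sizes[j]  # (2): X^T u_n = m
                  for j in range(k))
    nonneg = all(x >= 0 for row in X for x in row)            # (3)
    return gram_ok and rows_ok and cols_ok and nonneg


def uncut_weight(A, X):
    """(1/2) tr(X^T A X): total weight of the edges inside the subsets."""
    n, k = len(X), len(X[0])
    trace = sum(X[a][j] * A[a][b] * X[b][j]
                for j in range(k) for a in range(n) for b in range(n))
    return trace / 2


# Path graph 0 - 1 - 2 - 3 with unit weights, split into {0, 1} and {2, 3}.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
X = partition_matrix([{0, 1}, {2, 3}], 4)
total = sum(map(sum, A)) / 2                 # w(E) = u^T A u / 2
print(is_feasible(X, [2, 2]), uncut_weight(A, X), total - uncut_weight(A, X))
# True 2.0 1.0
```

Here two of the three edges stay inside a subset and one edge is cut, so the uncut weight is 2 and the cut weight is 1.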

We will use the model (GP) to obtain tractable relaxations for graph partitioning and hence
at least bounds on $w(E_{uncut})$.

Dropping the constraints (2) and (3) leads to one of the first relaxations for graph partitioning.
It was proposed by Donath and Hoffman in the 1970s [1]. $\lambda_j(M)$ denotes the j-th largest
eigenvalue of a (symmetric) matrix M.

$w(E_{uncut}) \le \max\{ \tfrac{1}{2} \operatorname{tr} X^T A X \mid X \text{ satisfies (1)} \} = \tfrac{1}{2} \sum_{j=1}^{k} m_j \lambda_j(A)$ (4)

Any X containing pairwise orthogonal eigenvectors $x_j$ corresponding to $\lambda_j(A)$ and having the
correct length $\|x_j\|^2 = m_j$ constitutes a maximizer in (4).

The Donath-Hoffman bound can be further strengthened by dropping only the nonnegativity
conditions from (GP), see [5]. In the case where the $m_j$ are all equal (to n/k), the linear term
in the bound is constant. Assumption: $m_1 = \ldots = m_k = n/k$. We denote by $V_n$ an arbitrary
matrix satisfying $V^T u = 0$, $V^T V = I_{n-1}$. In other words the columns of $V_n$ are orthonormal and
orthogonal to u. With $s(A) = u^T A u$,

$w(E_{uncut}) \le \max\{ \tfrac{1}{2} \operatorname{tr} X^T A X : X \text{ satisfies (1), (2)} \} = \frac{1}{2k} s(A) + \frac{n}{2k} \sum_{j=1}^{k-1} \lambda_j(V_n^T A V_n)$ (6)

This upper bound is attained for

$X = \frac{1}{k} u_n u_k^T + \sqrt{n/k}\; V_n Z V_k^T,$

where Z contains a set of k − 1 orthonormal eigenvectors corresponding to the largest $\lambda_j(V_n^T A V_n)$.

A further improvement can be achieved along the following lines [1, 5]. Let $d \in \mathbb{R}^n$ and X be
an arbitrary feasible partition, i.e. X satisfies (1), (2), (3). Then it can readily be seen that

$\operatorname{tr} X^T \left( \operatorname{diag}(d) - \frac{u^T d}{n} I \right) X = 0.$
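
The trace identity can be verified in two lines. The following derivation is a sketch that reads the garbled coefficient as $u^T d / n$ (an assumption) and uses only the fact that a feasible X has exactly one entry equal to 1 in each row and satisfies (1):

```latex
\begin{align*}
\operatorname{tr} X^T \operatorname{diag}(d)\, X
  &= \sum_{i=1}^{n} d_i \sum_{j=1}^{k} x_{ij}^2
   = \sum_{i=1}^{n} d_i = u^T d, \\
\operatorname{tr} X^T X
  &= \operatorname{tr} \operatorname{diag}(m) = \sum_{j=1}^{k} m_j = n, \\
\operatorname{tr} X^T \Bigl( \operatorname{diag}(d) - \tfrac{u^T d}{n} I \Bigr) X
  &= u^T d - \tfrac{u^T d}{n} \cdot n = 0 .
\end{align*}
```

Adding this term to A therefore leaves the objective of (GP) unchanged on feasible partitions, while it does change the eigenvalues that enter the relaxation.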

Therefore, see [5], it can be shown that

$w(E_{uncut}) \le \frac{1}{2k} s(A) + \frac{n}{2k} \sum_{j=1}^{k-1} \lambda_j\left( V_n^T \left( A + \operatorname{diag}(d) - \frac{u^T d}{n} I \right) V_n \right) =: \frac{1}{2k} s(A) + \frac{n}{2k} f(d).$

This leads to the following bound for partitioning the nodes into subsets of equal size, see [5]:

$w(E_{uncut}) \le \frac{1}{2k} s(A) + \frac{n}{2k} \min\{ f(d) : d \in \mathbb{R}^n \}.$ (7)

The corresponding maximizer X can be used to generate "good" partitions, by either rounding
or solving transportation problems, see [5], [2] for further details.

We conclude with some numerical results on graphs used by Johnson et al. [3] to test graph
bisection heuristics. In the following table n denotes the size of the (unweighted) graph of
cardinality |E|. We further provide the density in %. The column labeled "Upper Bound"
contains the upper bound (7), using tools from matrix analysis to calculate the largest eigenvalue,
and tools from nonlinear optimization to carry out the minimization. We consider bisecting the
graph, i.e. k = 2. The column labeled "Lower Bound" contains a bisection obtained by rounding
and limited local improvement. We also provide the relative gap (in %) between lower and upper
bound, and include the best solutions described in [3]. It is interesting to see that the present
approach yields not only tight bounds, but also produces at low computational cost feasible
solutions that are highly competitive with solutions obtained after extensive experiments using
simulated annealing and the Kernighan-Lin heuristic.


    n     |E|   Density (%)   Upper bd   Lower bd   % gap   Johnson
  124     149       2             141        136      3.7       136
  124     318       4             271        254      6.7       255
  124     620       8             467        442      5.6       442
  124    1271      17             853        822      3.8       822
  250     331       1             316        301      6.0       302
  250     612       2             531        495      7.3       498
  250    1283       4             981        925      6.1       926
  250    2421       8            1675       1588      5.5      1593
  500     625       0.5           600        573      4.7       573
  500    1223       1            1071       1001      7.0      1004
  500    2355       2            1844       1713      7.6      1727
  500    5120       4            3564       3358      6.1      3376
 1000    1272       0.25         1228       1172      4.8      1170
 1000    2496       0.5          2193       2030      8.0      2045
 1000    5064       1            3958       3676      7.7      3697
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1.  Introduction 

Background and description of the problem. In satellite communication, one satellite
can serve several radio stations on earth. In order to allow signals to be sent from each radio
station to each other radio station, the TDMA (time division multiple access) technique
is used. At any instant, the satellite is set to a fixed switching mode: all radio stations
transmit and receive data simultaneously, and the switching mode determines for each
radio station the radio station which receives the data which the former transmits. In
mathematical terms, a switching mode is a one-to-one mapping on the set of radio stations,
i.e., a permutation. The satellite time-multiplexes regularly between different switching
modes in short intervals, according to a fixed cyclic schedule.

The communication needs between the radio stations are given by a matrix $T = (t_{ij})$,
the traffic matrix; $t_{ij}$ is the amount of information per time unit that has to be transmitted
from the i-th to the j-th radio station. More information on the technical background can
be found, for example, in Burkard (1985). We consider the problem of setting up a schedule
for the satellite, i.e., a sequence of switching modes and a duration for each switching
mode. Formally, the matrix decomposition problem can be stated as follows:

Given a nonnegative n × n matrix $T = (t_{ij})$, find a decomposition of T, i.e., a
sequence of permutation matrices $P^1, P^2, \ldots, P^q$ and a sequence of nonnegative
weights $l_1, l_2, \ldots, l_q$ such that

$T \le \sum_{k=1}^{q} l_k P^k$ (elementwise). (1)

The total duration d of the decomposition is given by $d = \sum_{k=1}^{q} l_k$.

The  first  goal  in  setting  up  a  switching  schedule  is  of  course  to  keep  the  total  duration  as 
small  as  possible.  On  the  other  hand,  every  change  of  the  switching  mode  incurs  a  certain 
overhead  and  loss  of  time.  Therefore,  the  number  of  matrices,  q,  should  not  be  too  large. 
There  is  clearly  a  trade-off  between  the  two  objectives,  d  and  q. 

Related results and results of the present paper. Inukai [1979] and Burkard [1985]
have shown that the optimal total duration is equal to t*, the maximum row or column
sum of the traffic matrix, but in general, a time-optimal decomposition may require up to
$n^2 - 2n + 2$ matrices, which is too large for practical purposes. Burkard [1985] has also
given an algorithm which constructs such a decomposition in O(n^) steps.
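
The quantities in this problem statement are easy to compute for a small example. The Python sketch below is illustrative (the function names are hypothetical, and a permutation matrix is represented by a tuple `perm` with `perm[i]` the column of the 1 in row i); it evaluates the lower bound t* and checks condition (1) for a proposed decomposition.

```python
def max_line_sum(T):
    """t*: the maximum row or column sum of T."""
    n = len(T)
    rows = [sum(row) for row in T]
    cols = [sum(T[i][j] for i in range(n)) for j in range(n)]
    return max(rows + cols)


def covers(T, weights, perms):
    """Check T <= sum_k l_k P^k elementwise."""
    n = len(T)
    total = [[0] * n for _ in range(n)]
    for l, perm in zip(weights, perms):
        for i, j in enumerate(perm):
            total[i][j] += l
    return all(T[i][j] <= total[i][j] for i in range(n) for j in range(n))


T = [[3, 1], [1, 3]]
weights, perms = [3, 1], [(0, 1), (1, 0)]   # identity for 3 units, swap for 1 unit
print(max_line_sum(T), covers(T, weights, perms), sum(weights))  # 4 True 4
```

In this example the total duration equals t*, so the decomposition is time-optimal with q = 2 switching modes.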


This work was supported by the Fonds zur Förderung der wissenschaftlichen Forschung,
Project S32/01.
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It  is  clear  that  any  decomposition  must  consist  of  at  least  n  matrices,  unless  some  entries 
in  the  traffic  matrix  are  zero.  Gopal  and  Wong  [1985]  and  Rendl  [1985]  have  shown  that  the 
problem  of  constructing  a  shortest  decomposition  into  at  most  n  matrices  is  NP-complete. 
Thus,  it  makes  sense  to  look  for  heuristics.  The  currently  best  heuristic  for  decomposing 
into  n  matrices  is  due  to  Balas  and  Landweer  [1983].  A  more  extensive  review  of  results 
concerning  the  matrix  decomposition  problem  can  be  found  in  Burkard  (1991). 

We propose a simple and fast "scaling" heuristic for constructing a short schedule with
a given upper bound Q on the number q of switching modes. We can prove a relative error
guarantee  for  the  total  duration  d  of  the  decomposition.  The  method  is  not  applicable  if 
Q  =  n  or  Q  exceeds  n  only  slightly.  When  Q  is  somewhat  larger  than  n  (of  the  order  2n  or 
3n),  the  error  bound  is  still  very  crude,  but  it  improves  as  the  ratio  of  Q  and  n  increases. 

As a subproblem, we address the problem of decomposing a matrix under the constraint
that  the  lower  bound  t*  on  the  total  duration  has  to  be  achieved;  the  number  q  of  matrices 
remains  as  the  objective  function  to  be  minimized.  The  traffic  matrices  that  we  have  in 
mind  for  this  problem  are  matrices  with  small  integer  entries.  Here  we  use  two  heuristics: 
one  based  on  a  bottleneck  assignment  problem  and  on  matching  techniques,  and  a  more 
powerful  one  which  solves  maximum  flow  problems. 

As  a  side  issue,  we  mention  that  one  other  subproblem  that  we  have  to  solve  has  some 
connections  with  voting  systems. 

Finally  we  present  the  results  of  numerical  experiments  measuring  the  actual  behavior 
of  our  heuristics  for  some  randomly  generated  problems.  We  compare  our  algorithm  to 
the  heuristic  of  Balas  and  Landweer  [1983]  for  decomposing  into  only  n  matrices.  The 
experiments show that the heuristics perform very well, and that the method might be of
practical interest.

A full version of this abstract is available as a technical report, Rote and Vogel [1990].
2.  The  heuristics 

Our algorithm is based on the simple idea of scaling the entries of the given traffic matrix
and rounding them to small integers. A matrix with small integers will require a
small number q of matrices for decomposition; theoretically, we will utilize the trivial upper
bound t* on the number q of required permutation matrices. (Recall that t* is the
maximum row or column sum of the traffic matrix.)

Globally,  the  algorithm  runs  as  follows: 

Input: A non-negative real n × n matrix T.

(a) Choose some "unit" F > 0.

(b) Round the entries of the matrix upwards to the next multiple of F:

    $\bar{t}_{ij} := \lceil t_{ij} / F \rceil \cdot F.$

(c) Solve the matrix decomposition problem for the resulting matrix $\bar{T}$ (or
equivalently, for the integer matrix $(u_{ij}) := (\lceil t_{ij} / F \rceil)$ obtained by
dividing through F).

(d) The resulting decomposition can be adjusted downwards to compensate
for the rounding up in step (b).

The quality of the solution produced depends first of all on the choice of F in step (a).
The idea is to choose F so large that the matrix $(u_{ij})$ consists of small integers and only
few permutation matrices are needed for its duration-optimal decomposition in step (c),
and so small that the error incurred in the rounding in step (b) is not too large. By
choosing F appropriately, we will be able to give a performance guarantee for the quality
of the solution produced by the heuristic.
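
Steps (a) and (b) amount to a few lines of arithmetic. The Python sketch below is an illustration with a hypothetical function name `round_up_to_unit`; exact rational arithmetic (`fractions.Fraction`) is an implementation choice made here to avoid floating-point surprises in the ceiling, not something prescribed by the text.

```python
from fractions import Fraction
from math import ceil


def round_up_to_unit(T, F):
    """Return the integer matrix U = ceil(T / F) and the rounded matrix F * U."""
    F = Fraction(F)
    U = [[ceil(Fraction(t) / F) for t in row] for row in T]
    T_bar = [[u * F for u in row] for row in U]
    return U, T_bar


T = [[Fraction(7, 2), 1], [2, Fraction(5, 2)]]
U, T_bar = round_up_to_unit(T, 2)
print(U)  # [[2, 1], [1, 2]]
```

With F = 2 every entry of the rounded matrix is at least the original entry and exceeds it by less than F, which is the rounding error that the choice of F has to balance against the size of the integers in U.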

Step  (c)  is  the  heart  of  the  algorithm.  The  principal  goal  of  this  step,  a  duration- 
optimal  decomposition,  is  relatively  easy  to  achieve,  but  we  also  want  few  matrices,  since 
their  number  will  be  the  number  of  matrices  that  the  solution  will  have.  We  shall  discuss 
three  methods  for  carrying  out  this  step. 

Method  I  —  simple  and  fast:  edge  coloring 

Lemma  1.  An  integer  nxn  matrix  U  with  maximum  row  and  column  sum  u*  can  be 
decomposed  into  q  =  u*  permutation  matrices. 


Proof: If we restrict the problem to decomposition into "unit" permutation matrices, i.e.,
we allow only weights $l_k = 1$ in (1), then we get essentially an edge coloring problem for a
bipartite multigraph with vertices $r_i$ and $c_i$ (i = 1, ..., n), with $u_{ij}$ parallel edges between
$r_i$ and $c_j$. In a bipartite (multi-)graph, the number of colors required (the chromatic index)
equals the maximum degree, which is equal to u* in our case.

The currently fastest edge coloring algorithm is the one of Cole and Hopcroft [1982],
which leads to a time complexity of O(u* n log n) for carrying out the decomposition. The
bound $q \le u^*$ of lemma 1 is tight if and only if $u^* \le (n + 1)^2 / 4$ (see the appendix of Rote
and Vogel [1990]).
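
A decomposition into u* unit permutation matrices can also be obtained by elementary means. The Python sketch below is not the Cole-Hopcroft algorithm; it swaps in a simpler and slower technique chosen for illustration: pad the matrix until all row and column sums equal u*, then repeatedly remove a perfect matching found by augmenting paths. All function names are hypothetical, and the entries are assumed to be nonnegative integers.

```python
def pad_to_regular(U):
    """Increase entries until every row and column sum equals u*."""
    n = len(U)
    r = [sum(row) for row in U]
    c = [sum(U[i][j] for i in range(n)) for j in range(n)]
    u_star = max(r + c)
    V = [list(row) for row in U]
    i = j = 0
    while i < n and j < n:
        if r[i] == u_star:
            i += 1
        elif c[j] == u_star:
            j += 1
        else:
            add = min(u_star - r[i], u_star - c[j])
            V[i][j] += add
            r[i] += add
            c[j] += add
    return V, u_star


def perfect_matching(V):
    """Perfect matching on the positive entries of V (augmenting paths)."""
    n = len(V)
    row_of_col = [-1] * n

    def augment(i, seen):
        for j in range(n):
            if V[i][j] > 0 and not seen[j]:
                seen[j] = True
                if row_of_col[j] == -1 or augment(row_of_col[j], seen):
                    row_of_col[j] = i
                    return True
        return False

    for i in range(n):
        if not augment(i, [False] * n):
            raise ValueError("no perfect matching; matrix is not regular")
    perm = [0] * n
    for j, i in enumerate(row_of_col):
        perm[i] = j
    return tuple(perm)


def decompose_unit(U):
    """Return u* permutations whose matrices sum to at least U elementwise."""
    V, u_star = pad_to_regular(U)
    perms = []
    for _ in range(u_star):
        perm = perfect_matching(V)
        for i, j in enumerate(perm):
            V[i][j] -= 1
        perms.append(perm)
    return perms


U = [[2, 0, 1], [0, 1, 0], [1, 1, 0]]
perms = decompose_unit(U)
print(len(perms))  # 3
```

After padding, the matrix describes a regular bipartite multigraph, which always contains a perfect matching; removing one matching keeps the remainder regular, so the loop never fails.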

The next lemma, which relates F and the quality of the solution, can be proved by quite
straightforward calculations.


Lemma 2. If F is chosen as the smallest value such that $u^* \le Q$, for some given value
$Q \ge n$, then the following relation holds between the maximum row and column sum $F u^*$
of the rounded-up matrix and the corresponding value t* of the original matrix:

$F u^* \le \frac{Q}{Q - n + 1} \cdot t^*.$

By simply choosing F in our algorithm as the smallest value which gives $u^* \le Q$, the
preceding two lemmas yield the following theorem:

Theorem. An n × n matrix T can be decomposed into a weighted sum of no more than Q
permutation matrices ($Q \ge n$) with a total duration that is within a factor of Q/(Q − n + 1)
of the value t* that is obtainable without restriction on the number of matrices in the
decomposition.


For a positive matrix U the analysis of lemma 1 can be refined by the maximum flow
techniques discussed below to yield a bound of $q \le \lceil (u^* + n)/2 \rceil$. This allows us to
improve the bound Q/(Q − n + 1) in the above theorem to (Q − n/2)/(Q − n + 1/2) (for
arbitrary nonnegative matrices).

To actually determine F, we look at each row and column individually, find the smallest
F such that the sum of $\lceil t_{ij}/F \rceil$ in this row or column is at most Q, and take the
maximum of all those F's. The problem of computing for the, say, first row of T, the
smallest F such that $\sum_{j=1}^{n} \lceil t_{1j}/F \rceil \le Q$ occurs also in proportional voting systems and the
theory of apportionment (cf. e.g. Woodall [1982] or Balinski and Young [1983]). In this
case, the number F is the quota. Algorithmically, it can be determined as the (Q − n + 1)-largest
element in the (infinite) array $(t_{1j}/l)_{1 \le j \le n,\ l \ge 1}$. This array has sorted rows. We
remove from each row j of the array the first $\lfloor (Q - n)\, t_{1j} / \sum_{j'} t_{1j'} \rfloor$ elements. In this
way we remove between Q − 2n and Q − n elements, and F is the k-largest element in the
remaining array, where k is some number between 1 and n. This element can be found
in O(n) time by the method of Frederickson and Johnson [1982]. There are also simple
and practical methods which use priority queues and take O(n log n) time. In any case
the complexity of the algorithm is dominated by step (c), and it is O(Qn log n).
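
The characterization of F as an order statistic of the array of quotients t/l can be sketched with a priority queue. The Python code below is a simplified illustration with hypothetical names: it skips the preliminary removal step and simply pops the largest quotient Q − p + 1 times, where p is the number of positive entries in the line (p = n when all entries are positive; counting only positive entries is an assumption made here so that zero entries are handled). It therefore takes O(Q log n) time per line rather than O(n log n).

```python
import heapq
from fractions import Fraction
from math import ceil


def quota_for_line(line, Q):
    """Smallest F with sum(ceil(t / F) for t in line) <= Q."""
    positive = [Fraction(t) for t in line if t > 0]
    p = len(positive)
    if p == 0 or Q < p:
        raise ValueError("need a positive entry and Q >= number of positive entries")
    # Max-heap (via negated keys) holding, for each entry t, its current quotient t / l.
    heap = [(-t, t, 1) for t in positive]
    heapq.heapify(heap)
    F = None
    for _ in range(Q - p + 1):
        negated, t, l = heapq.heappop(heap)
        F = -negated
        heapq.heappush(heap, (-(t / (l + 1)), t, l + 1))
    return F


def unit_for_matrix(T, Q):
    """Maximum of the quotas over all rows and columns of T."""
    n = len(T)
    lines = [list(row) for row in T] + [[T[i][j] for i in range(n)] for j in range(n)]
    return max(quota_for_line(line, Q) for line in lines if any(t > 0 for t in line))


print(quota_for_line([3, 5], 4))  # 5/2
```

For the line (3, 5) and Q = 4 the three largest quotients are 5, 3 and 5/2, so F = 5/2, and indeed ⌈3/2.5⌉ + ⌈5/2.5⌉ = 2 + 2 = 4.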

Method  II  —  greedy:  bottleneck  assignment 

There is no way to use the edge coloring for a decomposition into fewer than u* matrices.
In practice, we would like to have an algorithm which uses as few matrices as possible
because this would allow us to choose F smaller, and we would lose less in the rounding-up
of step (b).

We  try  to  reduce  q  by  the  following  greedy  strategy:  We  select  the  first  weighted  permu¬ 
tation  matrix  /jP*  in  such  a  way  that  the  maximum  row  and  column  sum  of  the  remaining 
traffic  matrix  max{U  —  /iP*,0)  is  reduced  by  as  much  as  possible.  This  will  reduce  the 
bound  u*  on  the  number  (and.  hopefully,  also  the  actual  number)  of  further  matrices  which 
will  be  needed  in  the  decomposition.  We  continue  this  strategy  with  the  remaining  matrix 
until  we  are  done. 

By analyzing the condition that the maximum row and column sum of $U$ must be reduced
by $l_1$ when the matrix $\min(l_1 P^1, U)$ is subtracted, we get the following condition on $P^1$:

If $P^1_{ij} = 1$ then $l_1 \le u_{ij} + (u^* - \max\{r_i, c_j\})$.    (2)

Let us interpret this formula. For critical rows (and columns), i.e., rows with $r_i = u^*$,
$l_1$ must be $\le u_{ij}$. Non-critical rows and columns have some slack $u^* - r_i$ or $u^* - c_j$,
respectively, which allows us to weaken this inequality for their elements: The smaller of the
row slack and the column slack for each element can be added to the bound $u_{ij}$.

Finding the maximum value of $l_1$ for which such a permutation matrix $P^1$ exists is a
bottleneck assignment problem whose cost matrix is given by the right side of (2). After
determining $P^1$ and $l_1$, $\min(l_1 P^1, U)$ is subtracted from $U$. $P^2$ and $l_2$ are determined by
the same procedure for the remaining matrix, and so on.

The sequence $(l_1, l_2, \ldots, l_q)$ is weakly decreasing and consists of small numbers.
Therefore we solve the bottleneck assignment problem by testing for successive values of $l_k$ in
decreasing order. Each test amounts to finding a complete bipartite matching. For this
purpose, we can use the procedure of Hopcroft and Karp [1973], which requires $O(n^{5/2})$
steps, and thus method II can be carried out in $O((q + \ldots)$ time.

In our computational experiments we have additionally reduced $l_k$ in each step, if
necessary, in order to ensure that the graph defined by (2) has no isolated vertices. This was
sufficient to eliminate most of the unsuccessful tests.
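
One greedy step of method II might be sketched as below. The sketch is an illustration under assumptions, not the authors' program: the names `perfect_matching` and `greedy_step` are hypothetical, $U$ is assumed to be a square nonnegative integer matrix, and a simple augmenting-path matching is swapped in for the Hopcroft and Karp procedure to keep the code short.

```python
def perfect_matching(adj, n):
    """Return match[j] = row matched to column j, or None if no perfect
    matching exists. Plain augmenting paths (not Hopcroft-Karp)."""
    match = [-1] * n

    def augment(i, seen):
        for j in adj[i]:
            if not seen[j]:
                seen[j] = True
                if match[j] == -1 or augment(match[j], seen):
                    match[j] = i
                    return True
        return False

    for i in range(n):
        if not augment(i, [False] * n):
            return None
    return match


def greedy_step(U):
    """Largest weight l and a permutation (perm[i] = column of row i)
    satisfying criterion (2), found by testing l in decreasing order."""
    n = len(U)
    r = [sum(row) for row in U]                              # row sums
    c = [sum(U[i][j] for i in range(n)) for j in range(n)]   # column sums
    u_star = max(r + c)
    # Right-hand side of (2): entry plus the smaller of row and column slack.
    bound = [[U[i][j] + u_star - max(r[i], c[j]) for j in range(n)]
             for i in range(n)]
    l = max(max(row) for row in bound)
    while l >= 1:
        adj = [[j for j in range(n) if bound[i][j] >= l] for i in range(n)]
        match = perfect_matching(adj, n)
        if match is not None:
            perm = [0] * n
            for j, i in enumerate(match):
                perm[i] = j
            return l, perm
        l -= 1
    return 0, None
```

For $U = ((3, 1), (1, 3))$ every row and column is critical, so the bound matrix equals $U$ and the first matrix selected is the identity permutation with weight 3.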
Method III — even greedier: maximum network flow

By solving a maximum flow problem on a suitably defined graph, we can try to determine
all permutation matrices $P^k$ which belong to a group of equal $l_k$'s simultaneously. Our
maximum flow problem is an extension of the matching problem of method II.

We set up a network which has two nodes for each row $i$: a regular source node $r_i$ and
a "slack" node $\bar{r}_i$. Similarly, there is a sink node $c_j$ and a column slack node $\bar{c}_j$ for each
column $j$. For each entry $u_{ij}$ we have now two arcs: There is a "direct" arc from $r_i$ to
$c_j$ of capacity $\lfloor u_{ij}/l_k \rfloor$. This capacity counts how often a permutation may use the entry
$u_{ij}$ by reducing row and column sums $r_i$ and $c_j$. In addition, there is a "pseudo-arc" from
$\bar{r}_i$ to $\bar{c}_j$ of infinite capacity. Using this arc corresponds to reducing the slacks $u^* - r_i$ and
$u^* - c_j$ by $l_k$. This usage is restricted by the capacities of the "slack" arcs from $r_i$ to $\bar{r}_i$
of capacity $\lfloor (u^* - r_i)/l_k \rfloor$ and from $\bar{c}_j$ to $c_j$ of capacity $\lfloor (u^* - c_j)/l_k \rfloor$, which precede and
follow the pseudo-arc. However, when $u^* - r_i < l_k$, we set the capacity of the slack arc
to 1 instead of 0, but at the same time we eliminate all arcs out of $\bar{r}_i$ for which
$(u_{ij} \bmod l_k) + u^* - r_i < l_k$. We do the same for all columns.

This last modification ensures that, if an entry $u_{ij}$ should be usable by criterion (2), then
there is a way to send at least one unit of flow from $r_i$ to $c_j$. The remainder $(u_{ij} \bmod l_k)$,
which cannot be "used up" by the flow on the direct edge $(r_i, c_j)$ of capacity $\lfloor u_{ij}/l_k \rfloor$, is
put together with the slack $u^* - r_i$ to see whether a total of $l_k$ can be reached.

We place a constant supply and demand of value $g$ at each source vertex $r_i$ and at
each sink vertex $c_j$, respectively. The largest value of $g$ such that a flow satisfying all
these supplies and demands exists is the number of permutation matrices of weight $l_k$
into which we can decompose the matrix. To find those $g$ matrices, we decompose the
flow into "permutation flows". This can be done by the coloring algorithms of Cole and
Hopcroft [1982], which we used in method I. If $g > 0$, we have to try the same value of
$l_k$ again, since we may not have exhausted all possible matrices with weight $l_k$. If we got
$g = 0$, we decrease $l_k$ by one and try again.

3.  Computational  results 

We have programmed methods II and III and applied them to randomly generated test
problems of various sizes in the range from $n = 10$ to $n = 100$. For the various
subprocedures, we used the simplest possible implementations that did not take too much
time.

We have tried to get approximately $q \approx 2n$ matrices. Some target value $M$ for $u^*$, which
was determined experimentally, was an input parameter of the program, and we computed
$F$ by the formula $F = 1/2 \cdot (\ldots - n + 1) + \ldots$ We computed $u^*$ with this value of
$F$, and then we reduced $F$ as much as possible while still keeping the same $u^*$. With this
procedure, the desired number of $q = 2n$ matrices was achieved reasonably precisely. For
step (d), we successively lowered each fy* for $k$ from 1 to $q$, as much as possible while still
maintaining the relation

The results are shown in the following table. The numbers are averages of 100 matrices
each, with random integer entries in the range 1-100 for method II (and also in the last
column), and in the range 1-1000 for method III. CPU-times are given in seconds on a
DEC VAX 11/785 computer. The measure of the solution quality, the total duration, is
normalized in terms of the relative excess over the lower bound $t^*$, in order to make the
results comparable for different types of matrices. For contrast, the last column contains
the results of the heuristic of Balas and Landweer [1983] for decomposing into only $q = n$
matrices. Of course, this is not a fair comparison because their heuristic solves a more
restricted problem. Still, one can see how much may be gained in total duration by allowing
more than $n$ matrices, in particular for smaller $n$.


        Method II, q ≈ 2n, entries 1-100     Method III, q ≈ 2n, entries 1-1000    B&L, q = n
   n      M     CPU-time   (d-t*)/t*           M     CPU-time   (d-t*)/t*           (d-t*)/t*

  10     ...      0.20       ...              350      0.87      0.598%              7.84%
 ...     ...      1.16       ...              720      4.49      0.993%              6.13%
  30     700      3.40      1.648%           1200     12.68      1.014%              5.47%
  40     ...      7.57      1.596%            ...     27.49      0.962%              4.09%
  50     ...     14.07      1.504%            ...     51.31      0.884%              2.53%
 ...     ...     23.51      1.460%            ...     84.62      0.847%              2.36%
  80     ...     53.09      1.375%           5000    193.25       ...                2.62%
 100     ...    100.57      1.121%           7000    361.22       ...                1.78%
1 Introduction

Simulated annealing (SA) [17] is a widely used method for combinatorial optimization
problems. If this method is designed for optimization over continuous variables, i.e.
$\min\{f(x) \mid x \in M \subseteq R^n\}$, a close relationship between Simulated Annealing and so-called
Evolution Strategies (ES) [21][24][25] can be noticed. This is why several results
for ES algorithms can be used to design efficient parallel SA algorithms.

The study of convergence to the global minimum of SA has mainly concentrated on the
case of a finite or countable state space (see e.g. the review of Romeo and
Sangiovanni-Vincentelli [22]). For continuous state spaces there are results in the form of stochastic
differential equations [1][10][11], whereas a convergence proof for the original SA is given by
Haario and Saksman [13] for general state spaces.

In this paper the differences and similarities of SA and ES algorithms and their
implications for convergence results are investigated. It turns out that it is necessary to
adapt the sampling distributions over time to achieve a reasonable convergence rate
and it is shown by example that the gain of multiple trials from a single point is low.
Therefore the algorithm is modified so that it can be executed on a SIMD parallel
computer. This algorithm is much more reliable than multiple trial SA, which is supported
by some preliminary test results.

2  Markovian  Optimization  Algorithms 

Sequential variants of SA and ES algorithms can be studied in the general framework
of Markovian processes. The general algorithmic frame can be formulated as:

choose $X_0 \in M \subseteq R^n$ and set $t = 0$
repeat
    $Y_{t+1} = X_t + Z_t$
    $X_{t+1} = Y_{t+1} \cdot a(X_t, Y_{t+1}; .) + X_t \cdot (1 - a(X_t, Y_{t+1}; .))$
    increment $t$
until termination criterion applies

where $a(x, y; .)$ denotes the acceptance function which may depend on additional
parameters. The distribution of random vector $Z_t$ is chosen to be symmetric, i.e. $z \stackrel{d}{=} Bz$
for every orthogonal matrix $B$. In this case $z$ may be expressed in its stochastic
representation $z \stackrel{d}{=} r u$, where $r$ is a nonnegative random variable and $u$ a random vector
uniformly distributed on a hypersphere surface of dimension $n$ (see e.g. [9]). This
reveals that the trial point generation mechanism of the above algorithm is equivalent to
that of a random direction method with step size distribution $r$ (see [19][20]).
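
The algorithmic frame can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than the author's code: the names `markov_search`, `accept_es` and `make_accept_metropolis` are hypothetical, the step $z = ru$ is generated by normalizing a Gaussian vector, and the Metropolis rule is assumed as the usual simulated annealing acceptance criterion with a fixed temperature.

```python
import math
import random


def random_direction(n, rng):
    """Vector uniformly distributed on the unit hypersphere surface."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def accept_es(fx, fy, rng):
    """Evolution-strategy rule: accept only if the trial point is not worse."""
    return fy <= fx


def make_accept_metropolis(temperature):
    """Assumed SA rule: accept a worse point with probability
    exp((f(x) - f(y)) / T); T is held fixed in this sketch."""
    def accept(fx, fy, rng):
        if fy <= fx:
            return True
        return rng.random() <= math.exp((fx - fy) / temperature)
    return accept


def markov_search(f, x0, accept, radius, steps, seed=0):
    """Generic frame: Y = X + r*u, then X moves to Y if accepted.
    `radius(t, rng)` samples the nonnegative step length r."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for t in range(steps):
        r = radius(t, rng)
        u = random_direction(len(x), rng)
        y = [xi + r * ui for xi, ui in zip(x, u)]
        fy = f(y)
        if accept(fx, fy, rng):
            x, fx = y, fy
    return x, fx


def sphere(x):
    return sum(v * v for v in x)
```

A call such as `markov_search(sphere, [1.0, 1.0], accept_es, lambda t, rng: rng.random(), 200)` runs the elitist variant, whose objective value can never increase.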
Depending on the choice of the acceptance function $a(x, y; .)$ and of the generating
distribution of $z$ one gets a family of Markovian optimization algorithms which can be
identified by a sequence of transition probabilities $(P_t)_{t \ge 0}$:

$P_t(x, A) = \int_A Q_t(x, d\omega)\, q_t(x, \omega)\, d\omega + 1_A(x) \int_M Q_t(x, d\omega)\, (1 - q_t(x, \omega))\, d\omega$    (1)

with $A \subseteq M$, $x \in M$ and where $Q$ denotes the generating distribution, $1_A$ the
characteristic function of set $A$ and $q_t$ the acceptance probability function which is related to
the acceptance function $a_t$ via

$a_t(x, y; \xi) = 1_{[0,\, q_t(x, y)]}(\xi)$    (2)

where $\xi$ is a random variable uniformly distributed on $[0, 1]$. Typical examples are:
$q(x, y; .) = 1_{R^+}(f(x) - f(y) + T)$    (3)

$q(x, y; .) = 1_{R^+}(f(x) - f(y))$    (4)

$q(x, y; .) = \ldots$    (5)

where (3) is used by threshold accepting methods proposed by Dueck and Scheuer [8]
for combinatorial problems and tested by Bertocchi and Di Ordoardo [2] for continuous
variables, whereas (4) is applied by evolution strategies and (5) by simulated annealing.

Usually, the sampling distribution is chosen to be a uniform distribution on bounded
regions, e.g. fixed [31][16][30] or adapted hypercubes [29][14][6] and fixed [4] or adapted
hypersphere surfaces [3]. Since it is not possible with those distributions to reach each
state in $M$ when being trapped in a local minimum, one has to provide the algorithm
with the chance to perform some steps with worse objective function values to escape
from local minima. This is realized by using (5) in (2). However, in order to establish
convergence at all, the probability of accepting a worse point has to be decreased to zero
over time. The result of Haario and Saksman [13] indicates that the rate of decrease
has to be logarithmic as in the finite case.

Using this cooling schedule the rate of convergence is rather slow. Hence, other
schedules are used in practical applications (see table 1) which provide faster but possibly
nonglobal convergence. This problem can be circumvented by an appropriate choice of
the generating distribution. Indeed, if $M$ is bounded one might use the uniform
distribution over $M$ and global convergence for continuous functions follows from standard
arguments (see e.g. [7]) with $T_t = 0$. Szu and Hartley [27] claim that global convergence
can be established by employing a multidimensional Cauchy distribution with density
$g(x) = K_n\, T_t\, (T_t^2 + \|x\|^2)^{-(n+1)/2}$ which concentrates around 0 caused by the
schedule $T_t = T_0/(t + 1)$. The advantage over sampling distributions with bounded
support is due to the fact that for each trial there exists a (small) probability to reach
any state. Actually, under some conditions no cooling is necessary at all such that (5)
becomes equivalent to (4) and global convergence can be guaranteed.

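Theorem 1 (see e.g. [26][18])

Let $f^* > -\infty$ and let the Lebesgue measure of the level sets satisfy $\mu(L_{f^*+\epsilon}) > 0$ for all $\epsilon > 0$.
If

$\sum_{t=0}^{\infty} Q(X_t, L_{f^*+\epsilon}) = \infty \quad \forall \epsilon > 0$    (6)

then $f(X_t) \to f^*$ with probability one. □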
cooling type      schedule                                  references

geometrical       $T_t = c^t\, T_0$ with $c \in (0, 1)$     [29][31][30][6][3]
subtractive       $T_t = \max\{0, T_0 - t\,\Delta T\}$      [14]
linear            $T_t = T_0/(t + 1)$                       [27][28]
function value    $T_t = \alpha f(X_t) + \beta$             [4]

Table 1: Typical cooling schedules used in practical applications
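
The four schedules of table 1 translate directly into small functions. The sketch below is illustrative only; the function and parameter names are hypothetical, and the subtractive rule follows the reading $T_t = \max\{0, T_0 - t\,\Delta T\}$ of the table.

```python
def geometric(t, T0, c):
    """T_t = c**t * T0 with 0 < c < 1."""
    return (c ** t) * T0


def subtractive(t, T0, delta_T):
    """T_t = max(0, T0 - t * delta_T)."""
    return max(0.0, T0 - t * delta_T)


def linear(t, T0):
    """T_t = T0 / (t + 1), the schedule paired with Cauchy sampling."""
    return T0 / (t + 1)


def function_value(fx, alpha, beta):
    """T_t = alpha * f(X_t) + beta, driven by the current objective value."""
    return alpha * fx + beta
```

Only the subtractive schedule reaches temperature zero after finitely many steps; the geometric and linear schedules approach zero asymptotically.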


For instance, let $Y_{t+1} - X_t$ be multinormally distributed with zero mean and
covariance matrix $C_t = \sigma_t^2 I$, where $\min\{\sigma_t \mid t \ge 0\} \ge \sigma > 0$ and $f(x) \to \infty$ for $\|x\| \to \infty$.
Then, the lower level sets are bounded and there exists a minimum positive probability
to hit the level set $L_{f^*+\epsilon}$. Thus, $\liminf Q(X_t, L_{f^*+\epsilon}) > 0$ and the sum in (6) diverges.
Although global convergence of the above type should be the minimum requirement
of a probabilistic algorithm it is more interesting to know something about the finite
time behavior, i.e. the rate of convergence. For finite state space it is known that the
convergence rate of the probability to reach the optimal state is of order $1 - O(t^{-a})$ with
$a > 0$ depending on the problem (see Chiang and Chow [5]). This is slow convergence
since the rate of convergence of pure random search is of order $1 - O(\beta^t)$ with $\beta \in (0, 1)$.
Another measure of convergence rate is the expected error defined by $\delta_t := E[f(X_t) - f^*]$.
It can be shown that $\delta_t = O(t^{-a})$ for fixed sampling distributions even for strongly
convex functions [19]. However, if the distribution of $Z_t = r_t u$ is adapted via $r_t =
\|\nabla f(X_t)\|\, r$, where $r$ has nonvoid support on $(0, s)$, then one gets $\delta_t = O(\beta^t)$ with
$\beta \in (0, 1)$ for objective functions with "sufficiently spherical" level sets close to the
minimizer, e.g. strongly convex functions [19][20]. It can be shown that a success/failure
control as proposed in [21] can be used, too. If it were possible to adapt and
concentrate the sampling distribution to the lower level sets, geometrical convergence of
$\delta_t$ can be shown even for Lipschitz-continuous functions with several local minima [32].

3  Parallel  Simulated  Annealing 

Markovian algorithms considered so far are not well suited for parallelization. For
finite state space variants some proposals are surveyed in [12]. A straightforward way
to take advantage of parallel hardware is to perform, say $p$, trials in parallel on $p$
processors and to select the best move. This is the idea of so-called $(1,p)$-Evolution
Strategies [25] and it can be used for SA as well [3]. However, a simple example
reveals that the speedup is poor even for convex functions: Let $f(x) = \|x\|^2$ and
$M = R^n$. Then the convergence rate of the sequential version using criterion (4) can
be estimated to be $\delta_t \approx (1 - 0.405/n)^t\, (f(x_0) - f^*)$ for large $n$, whereas for the $(1,p)$-ES
$\delta_t \approx (1 - 2\log(p)/n)^t\, (f(x_0) - f^*)$ holds. It follows that the expected speedup is
$E[S_p] = O(\log p)$.
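
The logarithmic speedup can be checked numerically from the two convergence factors. The following sketch is only an illustration: the function names are hypothetical, the logarithm is assumed to be the natural logarithm, and the estimate is meaningful only while $2\log(p) < n$.

```python
import math


def iterations_needed(rate, reduction):
    """Number of iterations t with (1 - rate)**t == reduction."""
    return math.log(reduction) / math.log(1.0 - rate)


def expected_speedup(n, p, reduction=1e-6):
    """Ratio of sequential to (1,p)-ES iterations needed to shrink the
    expected error by the factor `reduction` on f(x) = ||x||^2."""
    sequential = iterations_needed(0.405 / n, reduction)
    parallel = iterations_needed(2.0 * math.log(p) / n, reduction)
    return sequential / parallel
```

For large $n$ the ratio is close to $2\log(p)/0.405$; with $n = 1000$ and $p = 64$ processors this gives a speedup of only about 20, which illustrates why multiple trials from a single point pay off poorly.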

Another straightforward parallelization scheme is to run the sequential algorithm on
$p$ processors independently [3] as a parallel version of the well-known multistart
technique. This is well suited for SIMD parallel computers which perform the same
instruction on $p$ processors in parallel but on different data streams as well as for MIMD
parallel computers which can simulate SIMD programs [23].

The general idea of Evolutionary Algorithms is to view the trial vector as the genome
of an individual that is mutated by the sampling distribution. Choosing the better trial
point for the next iteration/generation can be regarded as selection. It can be shown
that simply placing one individual on each processor (the processors are arranged in a
torus topology) and performing selection among the nearest neighbors is less reliable
than parallel multistart. However, introducing a recombination mechanism provides
the parallel algorithm with a new quality: Before reproducing a new trial point, another
individual is selected from the neighborhood and the genomes are merged. With this
mechanism it is possible to push individuals out of local minima. In biological terms,
the population gains or retains diversity, increasing the chance to find the global
minimum. Preliminary test results support this hypothesis.
1 Introduction

Duality is routinely used in conjunction with simplex-based algorithms for linear
programming. In conjunction with interior-point methods, the same can
hardly be said. Although it has been used in the design of many interior-point
algorithms [1],[5],[6], its role in postoptimality analysis and structure
exploitation when these algorithms are used is not fully investigated.

The  problem  of  postoptimality  analysis  is  central  to  optimisation  in  general 
and  linear  programming  in  particular.  It  relies  heavily  on  the  duality  concept 
without  which  linear  programming  would  not  be  such  a  powerful  decision 
making  tool.  It  is  concerned  with  the  stability  of  the  solution  set  of  a  given 
LP  problem  when  perturbations  occur  in  the  input  data.  Here,  emphasis  will 
be  on  discrete  changes  only. 

Structure is also an important aspect of linear programming. A lot of
effort has gone into designing algorithms which take advantage of structure in
general linear programmes but without much success [2]. These algorithms are
almost exclusively simplex-based. Looking into ways of exploiting structure
when interior-point methods are used is therefore an interesting challenge.
The  concern  here  will  be  with  common  structures  such  as  block-angular  and 
staircase. 


To  deal  with  these  questions  a  variant  of  Karmarkar’s  algorithm  which 
generates  dual  variables  will  be  used. 

The question of implementing this variant will be addressed in Section 2.
In Sections 3 and 4 ways of extending it to deal with postoptimality analysis
and structure respectively will be investigated. Section 5 will discuss
computational results and conclusion.

2  A  dual  variant  of  Karmarkar’s  Algorithm 

The dual variant of Karmarkar's algorithm considered in the present work is
based on an algorithm that can be found in one form or another in [1],[5],[6].
It handles LP problems in standard form with bounded and non-empty feasible
regions. No assumption is made about degeneracy and the algorithm has
polynomial complexity. An important feature of the algorithm is the way in
which improved lower bounds on the objective function value are found. The
set from which these bounds can be chosen will be explicitly given. Some
comparative results concerning robustness between this algorithm and a standard
simplex routine on Hilbert type LP problems will be reported.

3  Postoptimality  Analysis 

Postoptimality  analysis  considers  variations  in  the  coefficients  of  the  problem 
with  a  view  to  assessing  how  sensitive  to  these  variations  is  the  optimum 
solution  which  has  been  obtained.  When  simplex-based  algorithms  are  used 
the  process  of  postoptimality  analysis  can  be  conducted  without  difficulty. 
The crucial question is whether this process in the case of Karmarkar-type
algorithms is of similar difficulty and can be done with similar efficiency. It can
already be said that, given the way in which optimality is checked in interior-point
methods, postoptimality analysis is never going to be easy and cost effective.
In fact our limited investigation points to the use of standard results such as
the complementary slackness conditions, which are independent of the method
used to solve the LP problem, to conduct postoptimality analysis.
4 Structure Exploitation

In  [4]  an  extension  of  the  algorithm  mentioned  earlier  to  structured  LP  (block- 
angular  and  staircase)  has  been  given.  In  that  extension,  structure  exploitation 
was  at  the  level  of  the  computation  of  the  dual  solutions,  which  is  the  most 
expensive  step  of  the  algorithm.  It  relies  on  the  updating  algorithm  for  least 
squares  of  Heath  [3]. 

The way staircase structured problems are handled is by means of problem
manipulation so that a block-angular structure is arrived at. However, the
resulting block-angular problem has a linking block with many columns. The
updating algorithm, which includes the solution of a square system of linear
equations of the form $(I + FF^T)u = r$ where $F = (F_1 F_2 \cdots F_l)$ and $F_i = \ldots$,
$A$ being the linking block, would, therefore, not be effective. However,
the linking block is also structured and the structure is carried through to
the system of linear equations (see diagrams below); partitioning it to take
advantage of its structure appears to be an attractive way of improving the
updating step.

5  Computational  Results  and  Conclusion 

Experiments with well known difficult LP problems show that the dual
algorithm considered here is robust. The extended version has been tested
on block-angular problems. It leads to good speed-ups as the results show.
Postoptimality analysis is made possible as dual solutions are available.
However, it does not seem to be easily conducted via the interior-point method
considered. Much work needs to be done before any conclusion can be reached.




Partitioning of the Linking Block of the Matrix
Derived from a 5-Stage Staircase LP Problem


Structures  of  Ai*^  and  Pi- 
MARKOV DECISION PROCESSES WITH RESTRICTED
OBSERVATIONS: FINITE HORIZON MODEL

by

Yasemin Serin and Zeynep Müge Avşar
Department  of  Industrial  Engineering 
Middle  East  Technical  University 

In  this  report,  we  develop  an  algorithm  to  compute  optimal 
policies  for  Markov  Decision  Processes  subject  to  the 
constraints  that  result  from  some  observability  restrictions 
on  the  process.  We  assume  that  the  state  of  the  Markov 
process  under  consideration  is  unobservable,  but  there  is  an 
observable  process  related  to  the  original  one.  So,  we  want 
to  find  a  decision  rule  depending  on  this  observable  process 
only.  The  objective  is  to  minimize  the  total  expected 
discounted  cost  over  a  finite  horizon. 

Restricting the policies, as explained above, results in a
nonlinear programming model. The solution procedure for
this nonlinear problem is a method of feasible directions
(Bazaraa and Shetty (1979), Luenberger (1973)) that uses the
special structure of the problem. At the same time, it is a
policy iteration method that iterates between feasible
policies.

1. PROBLEM DEFINITION: Consider a Markov Decision
Process (MDP) $\{(X_t, A_t): t = 1, \ldots, T\}$ where $X_t$ is the state of the
system and $A_t$ is the action chosen in period $t$, $t = 1, \ldots, T$. We
use period $t$ or epoch $t$ to mean there are $t$ periods to go until
the end of the planning horizon. Let $S = \{1, 2, \ldots, N\}$ be the
state set and $A = \{1, 2, \ldots, M\}$ be the action set. The transition
probability law of the MDP is $P_{ij}(a) = P\{X_{t-1} = j \mid X_t = i, A_t = a\}$ for all
$i, j \in S$, $a \in A$. Let $P(a)$ be the transition matrix under the action
$a \in A$ and $C(X_t, A_t)$ be the cost incurred at time $t$ with expected
value $c_{ia} = E(C(X_t, A_t) \mid X_t = i, A_t = a)$. We assume that $c_{ia}$ for all
$i \in S$, $a \in A$, and $P(a)$ for all $a \in A$ are known. A nonstationary
Markovian policy can be described by $\alpha_{iat} = P(A_t = a \mid X_t = i)$ for
all $i \in S$, $a \in A$, $t = 1, \ldots, T$; $\alpha \in R^{NMT}$.

Let $S = \{S_1, S_2, \ldots, S_K\}$ be a given partition of the state space.
Suppose, at decision epoch $t$, the state of the process $X_t$ cannot
be observed, but only the subset, say $S_k$, that $X_t$ belongs
to is known. So, a practical decision rule is defined
depending on only $S_k$ at period $t$, rather than $X_t$, and the
same decision is used for every state in $S_k$. We define a
random variable $Z_t$ as $Z_t = k$ if and only if $X_t \in S_k$ and call $Z_t$
the observation variable. The process $\{Z_t, t = 1, \ldots, T\}$ is
called the observation process. So, the observation
variable $Z_t$ takes values from the observation set $O$,
$O = \{1, 2, \ldots, K\}$. A policy $\alpha$ is called a policy with respect
to the partition $S$, if $\alpha_{iat} = \alpha_{kat}$ for all $i \in S_k$, $a \in A$, $k \in O$, $t = 1, \ldots, T$,
and satisfies

$\sum_{a=1}^{M} \alpha_{kat} = 1$  for all $k \in O$, $t = 1, \ldots, T$    (1.1a)

$\alpha_{kat} \ge 0$  for all $k \in O$, $a \in A$, $t = 1, \ldots, T$    (1.1b)

In the remaining part of this report, we assume that we are
given a fixed partition $S$ of the state space. Also, by an
unrestricted MDP we mean the completely observable MDP and by the
restricted MDP the MDP under observability constraints
with respect to partition $S$. Clearly, we must have

$\alpha_{iat} = P(A_t = a \mid X_t = i)$  for $a \in A$, $i \in S_k$, $k \in O$, $t = 1, \ldots, T$    (1.2)
$\quad = P(A_t = a \mid X_t \in S_k)$
$\quad = P(A_t = a \mid Z_t = k)$  for $a \in A$, $k \in O$, $t = 1, \ldots, T$
$\quad = \alpha_{kat}$

The aim is to find a policy $\alpha^*$ with respect to partition $S$ that
minimizes the expected total discounted cost over a T-period
horizon. The same problem under an infinite horizon is studied
in Serin (1989) and Kulkarni and Serin (1990). This problem
can also be considered as a partially observable MDP
problem (Monahan (1982)). The methodology used here is
different from partially observable Markov Decision Process
methodology.

2. MODEL AND THE SOLUTION METHOD: The optimal
policy for the unrestricted MDP problem can be found by
probabilistic dynamic programming, moving backward period
by period. The optimal expected total discounted cost
over a $t$-period planning horizon starting with an
initial state $i$, $v^*_{it}$, satisfies

$v^*_{it} = \min_{a \in A} \Big\{ c_{ia} + \gamma \sum_{j=1}^{N} P_{ij}(a)\, v^*_{j,t-1} \Big\}$    (2.1)

for all $i \in S$ and $t = 1, \ldots, T$, where $\gamma$ is the discount factor. A
constant value is assigned to the $v^*_{i0}$'s (Hillier and Lieberman
(1974)), e.g., $v^*_{i0} = 0$.
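
Recursion (2.1) can be sketched as a short backward pass. The code is an illustration only, not taken from the report: the name `backward_dp` and the data layout (`P[a][i][j]`, `c[i][a]`) are assumptions, states and actions are indexed from 0 instead of 1, and the terminal values $v^*_{i0}$ are set to 0.

```python
def backward_dp(P, c, gamma, T):
    """Backward recursion for the unrestricted MDP.

    P[a][i][j]: transition probability from state i to j under action a.
    c[i][a]:    expected immediate cost in state i under action a.
    Returns (v, policy), where v[t][i] is the optimal cost with t periods
    to go and policy[t][i] a minimizing action; index t = 0 is terminal.
    """
    N, M = len(c), len(P)
    v = [[0.0] * N]
    policy = [None]
    for t in range(1, T + 1):
        v_t, a_t = [], []
        for i in range(N):
            q = [c[i][a] + gamma * sum(P[a][i][j] * v[t - 1][j] for j in range(N))
                 for a in range(M)]
            best = min(range(M), key=q.__getitem__)
            v_t.append(q[best])
            a_t.append(best)
        v.append(v_t)
        policy.append(a_t)
    return v, policy
```

Because a single minimizing action is stored for each state and period, the resulting policy is deterministic.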

We can state the expected total discounted cost minimization
problem for unrestricted MDP by the following linear
programming problem:

Minimize  $\sum_{i=1}^{N} \sum_{a=1}^{M} c_{ia} \Big( \sum_{t=1}^{T} y_{iat} \Big)$    (2.2a)

subject to

$\sum_{a=1}^{M} y_{iaT} = p_i$  for all $i \in S$    (2.2b)

$\sum_{a=1}^{M} \Big( y_{iat} - \gamma \sum_{j=1}^{N} y_{ja(t+1)} P_{ji}(a) \Big) = 0$  for all $i \in S$, $t = 1, \ldots, T-1$    (2.2c)

$y_{iat} \ge 0$  for all $i \in S$, $a \in A$, $t = 1, \ldots, T$    (2.2d)

where the decision variable $y_{iat}$ can be interpreted as the
discounted probability of being in state $i$ and taking action $a$
in period $t$, and $p_i$ is the probability of being in state $i$ at the
beginning of the planning horizon. Then, the optimal solution is

$y^*_{iat} = \gamma^{T-t} P_{\alpha^*}(X_t = i, A_t = a)$  for all $i \in S$, $a \in A$, $t = 1, \ldots, T$    (2.3)

The optimal policy $\alpha^*$ is given by

$\alpha^*_{iat} = \dfrac{y^*_{iat}}{\sum_{a=1}^{M} y^*_{iat}} = P_{\alpha^*}(A_t = a \mid X_t = i)$  for all $i \in S$, $a \in A$, $t = 1, \ldots, T$    (2.4)

and $\alpha^*$ satisfies (1.1a) and (1.1b).

At a basic optimal solution, $y^*_{iat}$ can take a positive value for
at most one action, which is in accordance with the
implication of recursion (2.1), i.e., the optimal policy is
deterministic. If it is not possible to be in state $i$ at some
period, some arbitrary action is assigned to that state (Ross
(1989)).

Now, we may define $w_{it}$ as the discounted probability of
being in state $i$ at period $t$ under a given policy $\alpha$, i.e.,

$w_{it} = \gamma^{T-t} P_{\alpha}(X_t = i)$  for all $i \in S$, $t = 1, \ldots, T$    (2.5)

$y_{iat} = w_{it}\, \alpha_{iat}$  for all $i \in S$, $a \in A$, $t = 1, \ldots, T$    (2.6)

Now, we are ready to consider the above MDP under
observability constraints. Suppose that $\{Z_t, t = 1, \ldots, T\}$ is the
observation process defined over $O = \{1, \ldots, K\}$ characterized by
a partition $S = \{S_1, \ldots, S_K\}$. If $\alpha$ is a policy with respect to
partition $S$, then the probability of taking action $a$ at some
period $t$ is the same for all the states in the same subset.
Then, observability constraints in (1.2) with respect to
partition $S$ are introduced by imposing

$\alpha_{iat} = \alpha_{jat}$  for all pairs $i, j$ in the same subset
$\quad = \alpha_{kat}$  for all $i, j \in S_k$

on the feasible policy space of the unrestricted MDP problem.
$P_{ij}(\alpha, t)$ is the probability of being in state $j$ when there are
$(t-1)$ periods to go until the end of the planning horizon,
given that the system is in state $i$ when there are $t$ periods to
go until the end of the planning horizon and when the policy
$\alpha$ is employed,

Pij(a,  t)  =  Pa(X,.i=jl  Xi=i) 

M 

=  X  ®k(i)atPi/a)  for  ail  •.  T  and  i,  jeS  (2.7) 

a«l 

and  Cjg(a)  is  the  expected  immediate  cost  incurred  under 
policy  a ,  given  that  the  system  is  in  state  i  at  the  beginning 
of  the  period  t 

M 

Ci^a)  =  X  <^k(i)iiCia  for  all  i£S,  t=l, ...,  T  (2.8) 

aal 

and the cost vector in period $t$ is

$c_t(\alpha) = (c_{1t}(\alpha), c_{2t}(\alpha), \ldots, c_{Nt}(\alpha))'$  (2.9)

The optimal policy $\alpha^*$ with respect to partition $S$ for an MDP is given by the solution of the following problem.

Problem D:

Minimize

$\phi(\alpha) = p' \left[ c_T(\alpha) + \sum_{t=1}^{T-1} \gamma^{(T-t)} P(\alpha, T) \cdots P(\alpha, t+1)\, c_t(\alpha) \right]$  (2.10a)

subject to

$\sum_{a=1}^{M} \alpha_{kat} = 1$ for all $k \in O$, $t = 1, \ldots, T$  (2.10b)

$\alpha_{kat} \ge 0$ for all $k \in O$, $a \in A$, $t = 1, \ldots, T$  (2.10c)

Note  that  there  is  a  deterministic  global  optimal  policy  to 
Problem  D. 
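
The objective (2.10a) can be evaluated for a given partition-based policy by a backward recursion over the number of periods to go: the cost-to-go satisfies v_t = c_t(alpha) + gamma P(alpha, t) v_{t-1} with v_0 = 0, which expands to the sum in (2.10a). The following is a minimal sketch of that evaluation; the function name, the argument layout, the symbol gamma for the discount factor and the toy data are all assumptions made for illustration.

```python
def policy_cost(p, P, c, part, alpha, gamma):
    """Expected discounted cost of a partition-based policy.

    p[i]           probability of starting in state i
    P[a][i][j]     transition probability from i to j under action a
    c[i][a]        immediate cost of action a in state i
    part[i]        index k(i) of the subset that contains state i
    alpha[t][k][a] probability of action a in subset k with t + 1 periods to go
    gamma          discount factor
    """
    n, m = len(p), len(P)
    v = [0.0] * n  # cost-to-go when no periods remain
    for alpha_t in alpha:  # 1, 2, ..., T periods to go
        new_v = []
        for i in range(n):
            probs = alpha_t[part[i]]
            cost = sum(probs[a] * c[i][a] for a in range(m))
            future = sum(probs[a] * P[a][i][j] * v[j]
                         for a in range(m) for j in range(n))
            new_v.append(cost + gamma * future)
        v = new_v
    return sum(p[i] * v[i] for i in range(n))


# Toy data (hypothetical): two states that cannot be told apart, two actions.
P = [[[1.0, 0.0], [0.0, 1.0]],   # action 0: stay in the current state
     [[0.0, 1.0], [1.0, 0.0]]]   # action 1: switch to the other state
c = [[1.0, 0.0], [0.0, 2.0]]     # c[i][a]
part = [0, 0]                    # a single subset containing both states
alpha = [[[1.0, 0.0]],           # one period to go: always action 0
         [[0.0, 1.0]]]           # two periods to go: always action 1
phi = policy_cost([0.5, 0.5], P, c, part, alpha, gamma=0.5)
```

Because the policy is indexed by subset rather than by state, both states necessarily use the same action probabilities in each period, which is exactly the observability restriction.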

To obtain a solution to this problem, we use the method of feasible directions (Bazaraa and Shetty (1979)). The algorithm we develop iterates between deterministic policies, using the fact that there exists a deterministic global optimal policy for Problem D. To guarantee improvement at each iteration from one deterministic policy to another, a descent direction is selected in such a way that the policy improvement is achieved through changes in the partial policy of only one period, although there may be other periods implying improvement, i.e., contributing a negative value to the directional derivative. Proceeding along such a direction causes improvement at a constant rate. Then, if the search procedure starts with a deterministic policy, the iterations move between deterministic policies by taking a step of size one at each iteration. As in the case of the policy iteration algorithm of Howard (1971) for the unrestricted MDP, the algorithm proceeds along the steepest descent directions satisfying the above conditions.
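
The idea of moving between deterministic policies by changing one period at a time can be sketched as a simple local search. This is a simplified illustration and not the feasible-direction algorithm itself: instead of selecting a steepest descent direction, it tries every single change of action for one (period, subset) pair and accepts it whenever the cost drops. All names and the toy data are hypothetical.

```python
def det_cost(p, P, c, part, d, gamma):
    """Cost of a deterministic policy; d[t][k] is the action chosen in
    subset k when t + 1 periods remain."""
    n = len(p)
    v = [0.0] * n
    for d_t in d:
        v = [c[i][d_t[part[i]]]
             + gamma * sum(P[d_t[part[i]]][i][j] * v[j] for j in range(n))
             for i in range(n)]
    return sum(p[i] * v[i] for i in range(n))


def one_period_search(p, P, c, part, d, gamma, tol=1e-12):
    """Accept single (period, subset) action changes while they lower the cost."""
    d = [list(row) for row in d]
    best = det_cost(p, P, c, part, d, gamma)
    improved = True
    while improved:
        improved = False
        for t in range(len(d)):
            for k in range(len(d[t])):
                for a in range(len(P)):
                    old = d[t][k]
                    if a == old:
                        continue
                    d[t][k] = a
                    value = det_cost(p, P, c, part, d, gamma)
                    if value < best - tol:
                        best, improved = value, True
                    else:
                        d[t][k] = old  # undo a change that does not help
    return d, best


# Toy data (hypothetical): two indistinguishable states, two actions.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
c = [[1.0, 0.0], [0.0, 2.0]]
policy, cost = one_period_search([0.5, 0.5], P, c, [0, 0], [[0], [1]], 0.5)
```

Since the objective is linear in the policy of any single period when the other periods are held fixed, every accepted change is a unit step between two deterministic policies. The search stops at a policy that no single-period change can improve, which may be only a local optimum.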

3. CONCLUSION: Using feasible descent directions that change the partial policy of only one period per iteration may cause the algorithm to terminate only after a large number of iterations. Another disadvantage of the algorithm is the risk of terminating at a deterministic local optimum or saddle point even though there may exist a randomized local optimum or saddle point with a lower expected cost. The reason is that the algorithm does not take randomized policies into account. Along the line between the deterministic policies of two successive iterations of the algorithm there cannot be any point satisfying the necessary Kuhn-Tucker conditions, because the expected cost function decreases linearly. However, there can be randomized policies which do not lie on any such line.

We propose another algorithm for solving Problem D which allows changes in the partial policy of every period in an iteration and proceeds along the steepest descent directions. Directions that make changes in more than one period cause the expected cost function to be a nonlinear function of the step size. Then, to minimize the cost function along such directions, the policy improvement step must include a line search, which is the computational burden of this algorithm and may slow it down in terms of the computation time required until termination. On the other hand, it may decrease the number of iterations. In that case, randomized policies can also be encountered.

1. Introduction

Consider a connected undirected network $G = (N \cup \{0\}, E)$ with a set of nodes $N \cup \{0\}$ and a set of arcs $E$. The subset $D$, $D \subseteq N$, is the set of communities (users). A common supplier $0$ provides a service which is required by the communities, and any community receiving the service can in turn deliver it to adjacent communities. Each community in $D$ is required to be connected, perhaps through other communities, to the common supplier. There is a cost $w((i,j)) = w_{ij} > 0$, $(i,j) \in E$, if arc $(i,j)$ is used to deliver service. The objective is to provide service to the communities in $D$ at minimum cost. We will refer to the above optimization problem as the minimum cost Steiner Tree (ST) problem.

We provide in this paper a computational analysis of a game theoretic approach to a cost allocation problem arising in a minimum cost ST problem. The cost allocation is concerned with the fair distribution of the cost of providing the service among customers. We formulate this cost allocation problem as a cost cooperative game in characteristic function form, referred to as the ST-game. The ST-game generalizes several cooperative games studied in the literature which were used to analyze a variety of cost allocation problems. For example, the class of ST-games properly generalizes the class of minimum cost spanning tree games (Bird (1976), Granot and Huberman (1981, 1984)), tree games (Megiddo (1978)) and airport games (Littlechild (1974)). The ST-game is equivalent to the Fixed Cost Spanning Forest (FCSF) game, which was studied by D. Granot and F. Granot (1990) for the special case when the underlying network $G$ has a tree structure. They show that in this very special case the core of an FCSF game is not empty. We extend the analysis to general networks. The work in this paper is also related to Sharkey's (1990) study of the shared facility game. Therein, he defines a simple game and shows that the core of a simple game is nonempty if and only if the optimal values of the respective objective functions of the associated IP (Integer Program) and LP (Linear Program) are equal. Here, we analyze the relationship between a certain IP and LP associated with the ST-game (note that the ST-game is not a simple game).

It is shown that in general the core of an ST-game may be empty. Our main result provides a sufficient (and in some cases necessary) condition for the nonemptiness of the core of the ST-game. It turns out that the core is not empty if the
incidence  vector  of  an  optimal  ST  coincides  with  an  optimal  solution  to  a  certain  linear 
programming  problem.  We  also  show  that  the  reverse  is  not  necessarily  true.  Further, 
given  an  optimal  ST,  we  construct  an  0{n^)  algorithm  (where  n  is  the  number  of 
nodes)  which  verifies  whether  the  above  sufficient  condition  is  satisfied.  Moreover,  if 
the  answer  to  the  above  algorithm  is  positive  it  generates  a  cost  allocation  vector  in 
the  core. 

This extended abstract is organized as follows. In Section 2 we review some standard definitions and introduce some notation. In Section 3 we provide a sufficient condition for the nonemptiness of the core of the ST-game. In Section 4 we present an efficient algorithm to check whether the sufficient condition for the existence of a core allocation is satisfied and, in case of a favorable answer, we provide a point in the core of the ST-game. Finally, in Section 5 we give some concluding remarks.

2  Definitions  and  Preliminaries 

The minimum cost ST problem can also be formulated for directed graphs. The minimum cost Directed Steiner Tree (DST) problem is defined with respect to a directed weighted graph $G = (N \cup \{0\}, E)$ with a weight function $w: E \to R^+$. Namely, given a subset of nodes $D \subseteq N$, find a directed tree $T = (N_T \cup \{0\}, E_T)$ in $G$, rooted away from node $0$ and whose node set contains $D$, such that the total edge-weight of $T$ is minimum. It is clear that any minimum cost ST problem can be solved by considering an appropriate minimum cost DST problem, obtained by replacing each edge of the given network by two arcs of opposite directions.

In order to analyze the cost allocation problem associated with the minimum cost ST problem, we formulate this cost allocation problem as a cooperative game. Consider the ST problem on a network $G = (N \cup \{0\}, E)$ with a set of users $D \subseteq N$. Denote by $ST_Q$, for $Q \subseteq D$, the ST problem obtained from the original problem by simply replacing $D$ by $Q$. Then the pair $(D; c)$, where $c: 2^D \to R$ is such that $c(\emptyset) = 0$ and for each $Q \subseteq D$, $c(Q)$ is the minimum objective function value of $ST_Q$, is a game to be referred to as the ST-game. For $x \in R^{|D|}$ and $Q \subseteq D$, let $x(Q) = \sum_{j \in Q} x_j$. We can interpret $x(Q)$ as the part of the total cost paid by the coalition $Q$. A cost allocation vector $x$ in a game $(D; c)$ satisfies $x(D) = c(D)$, and the solution theory of cooperative games is concerned with the selection of a reasonable subset of cost allocation vectors.

Central to the solution theory of cooperative games is the solution concept referred to as the core of a game. The core of a game $(D; c)$ consists of all vectors $x \in R^{|D|}$ such that $x(Q) \le c(Q)$ for all $Q \subseteq D$, and $x(D) = c(D)$. Observe that the core consists of all allocation vectors $x$ which provide no incentive for any coalition to secede.
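
The core conditions can be checked directly for a small game by enumerating all coalitions. The sketch below assumes the characteristic function is available as a dictionary keyed by coalitions; the function name and the three-player cost values are hypothetical and serve only to illustrate the definition.

```python
from itertools import combinations


def in_core(x, c, tol=1e-9):
    """Return True if allocation x lies in the core of the cost game c.

    x maps each player to its cost share; c maps frozenset coalitions to costs.
    """
    players = sorted(x)
    # Efficiency: the whole cost c(D) must be distributed.
    if abs(sum(x.values()) - c[frozenset(players)]) > tol:
        return False
    # No proper coalition may be charged more than its stand-alone cost.
    for size in range(1, len(players)):
        for q in combinations(players, size):
            if sum(x[j] for j in q) > c[frozenset(q)] + tol:
                return False
    return True


# Hypothetical game: singletons cost 2, pairs cost 3, the grand coalition 4.
c = {frozenset(q): cost
     for size, cost in ((1, 2.0), (2, 3.0), (3, 4.0))
     for q in combinations((1, 2, 3), size)}
inside = in_core({1: 1.5, 2: 1.5, 3: 1.0}, c)
outside = in_core({1: 2.0, 2: 2.0, 3: 0.0}, c)
```

The second allocation distributes the full cost but charges coalition {1, 2} a total of 4, more than its stand-alone cost of 3, so that coalition would secede. The enumeration is exponential in the number of players, so it is useful only for small examples.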


3  The  Core  of  the  ST-Game 

It was shown by D. Granot and G. Huberman (1981) that the core of the ST-game is not empty when all nodes are communities, i.e., $D = N$. Unfortunately, this result cannot be extended to cases when $D \ne N$. Indeed, a simple example below (shown to me by A. Tamir) demonstrates that the core of an ST-game may be empty.

Explicitly, consider the directed network $G = (N \cup \{0\}, E)$ shown in Fig. 3.1 below. Therein, $N = \{1,2,3,4,5,6\}$, and $D = \{1,2,3\}$ is the set of communities. Further, we assume that $w(i,j) = 1$ for all $(i,j) \in E$.

The core, $C(D,c)$, of the ST-game associated with network $G$ is given by:

$C(D,c) = \{ x \in R^3 : x_1 \le 2,\ x_2 \le 2,\ x_3 \le 2,\ x_1 + x_2 \le 3,\ x_1 + x_3 \le 3,\ x_2 + x_3 \le 3,\ x_1 + x_2 + x_3 = 5 \}$

Now, one can easily verify that the core constraints induced by the three two-member coalitions imply that $x_1 + x_2 + x_3 \le 4\frac{1}{2}$ for any core allocation (summing the three inequalities gives $2(x_1 + x_2 + x_3) \le 9$). Thus, since any core allocation $x$ must distribute the entire cost, i.e., $x_1 + x_2 + x_3 = 5$, we conclude that the core of the ST-game associated with $G$, displayed in Fig. 3.1, is empty.
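
The coalition costs behind this example can be recomputed by brute force. The sketch below is only an illustration under a stated assumption: Fig. 3.1 is not reproduced here, so the arc set is taken to be the nine unit-weight arcs that carry flow in the fractional LP solution discussed later in this section. The helper names are hypothetical.

```python
from itertools import combinations

# Assumed arc set for Fig. 3.1 (tail, head); the figure itself is not
# available, so this list is an inference, not a transcription.
ARCS = [(0, 4), (0, 5), (0, 6), (4, 1), (4, 2), (5, 2), (5, 3), (6, 1), (6, 3)]


def reaches(arcs, targets, root=0):
    """True if every target is reachable from root using only `arcs`."""
    seen, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for tail, head in arcs:
            if tail == u and head not in seen:
                seen.add(head)
                stack.append(head)
    return set(targets) <= seen


def coalition_cost(targets, arcs=ARCS):
    """Fewest unit-weight arcs that connect the root to all targets."""
    for size in range(len(arcs) + 1):
        for subset in combinations(arcs, size):
            if reaches(subset, targets):
                return size
    raise ValueError("targets are not reachable from the root")


D = (1, 2, 3)
cost = {q: coalition_cost(q) for r in (1, 2, 3) for q in combinations(D, r)}
# Summing the three pair constraints bounds x(D) by half of their total cost.
pair_bound = sum(cost[q] for q in combinations(D, 2)) / 2
```

Under this assumed network the pair constraints cap the distributable cost at 4.5, below the cost 5 of serving all three communities, which is the emptiness argument given above.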

Below, we provide a sufficient condition for the nonemptiness of the core of the ST-game. It is based on the integer programming formulation of the minimum cost DST problem used in Prodon et al. (1985). To describe their formulation, as applied to our minimum cost DST problem, we need the following notation. Let $G = (N \cup \{0\}, E)$ be a directed graph and $D$, $D \subseteq N$, the set of communities. For a directed edge $f = (i,j)$ we refer to $i$ as the tail and $j$ as the head of $f$, and for a subset of vertices $S$, $S \subseteq N$, we denote by $\delta(S)$ the set of all directed edges having their heads, but not their tails, in $S$. A subset $S$, $S \subseteq N$, is said to be an admissible cut-set of $G$ if $S \cap D \ne \emptyset$ ($D$ is the set of communities) and both subgraphs $G(S)$ and $G(N \cup \{0\} \setminus S)$ of $G$ induced by $S$ and $N \cup \{0\} \setminus S$, respectively, are connected. We denote by $A$ the set of all admissible cut-sets of $G$. Now the DST problem can be formulated as the following integer programming problem:

$IP(D): \min \{ cx : x(\delta(S)) \ge 1,\ S \in A,\ S \cap D \ne \emptyset,\ x \in \{0,1\} \}$.

Then our ST-game based on the above formulation of the DST problem is the pair $(D, c)$, where $c: 2^D \to R$ is such that $c(\emptyset) = 0$ and for each $Q \subseteq D$, $c(Q)$ is the minimum objective function value of $IP(Q)$.

Clearly, the exponential number of core constraints, coupled with the fact that $IP(Q)$ is NP-hard whenever $2 < |Q| < |N|$, makes the core computations hard. We provide some shortcuts that enable efficient computation in certain cases.

Consider the linear programming relaxation $LP(D)$ of $IP(D)$ defined as follows:

$LP(D): \min \{ cx : x(\delta(S)) \ge 1,\ S \in A,\ S \cap D \ne \emptyset,\ x \ge 0 \}$

In view of favorable computational results obtained by Prodon et al. (1985) and Chopra et al. (1992) with instances of the DST problem on general graphs, they were led to conjecture that the inequalities describing the feasible region of $LP(D)$, while not sufficient to describe the polyhedron for the DST problem, are nevertheless important in the sense that they often produce optimal integer extreme points. However, notwithstanding this favorable computational experience, $LP(D)$ would fail to produce optimal directed Steiner trees even for a very simple case like the network $G$ in Fig. 3.1. Indeed, consider the directed network $G$ given in Fig. 3.1. All the edge weights in $G$ are assumed to be 1. Then it is easy to check that the minimum cost DST in $G$, with root $0$ and whose vertex set contains vertices 0, 1, 2, 3, has a weight of 5. However, the vector $x^*$ defined as follows: $x^*_{(0,4)} = x^*_{(0,5)} = x^*_{(0,6)} = x^*_{(4,1)} = x^*_{(4,2)} = x^*_{(5,2)} = x^*_{(5,3)} = x^*_{(6,1)} = x^*_{(6,3)} = \frac{1}{2}$, and $x^*_e = 0$ otherwise, is feasible to $LP(D)$ associated with $G$ and has a lower objective function value of 4.5.
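
Feasibility of such a fractional vector can be verified mechanically by enumerating cut-sets. The sketch below checks every subset $S$ of $N$ that meets $D$, which is a superset of the admissible cut-sets, so finding no violation is sufficient for feasibility in $LP(D)$. The arc list is the same assumption about Fig. 3.1 as in the text above (nine arcs carrying value 1/2); the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

HALF = Fraction(1, 2)
# Fractional solution on the assumed arcs of Fig. 3.1, keyed by (tail, head).
X_STAR = {arc: HALF for arc in [(0, 4), (0, 5), (0, 6), (4, 1), (4, 2),
                                (5, 2), (5, 3), (6, 1), (6, 3)]}
N = (1, 2, 3, 4, 5, 6)
D = {1, 2, 3}


def cut_value(x, s):
    """x(delta(S)): total value on arcs whose head is in S and tail is not."""
    return sum(v for (tail, head), v in x.items()
               if head in s and tail not in s)


def violated_cuts(x, nodes=N, users=D):
    """All S within `nodes` that meet `users` and have x(delta(S)) < 1."""
    bad = []
    for size in range(1, len(nodes) + 1):
        for combo in combinations(nodes, size):
            s = set(combo)
            if s & users and cut_value(x, s) < 1:
                bad.append(s)
    return bad


violations = violated_cuts(X_STAR)
objective = sum(X_STAR.values())  # unit weights, so cx is the sum of x
```

Exact fractions are used so that the comparison with 1 is not affected by rounding. The enumeration visits 2^6 subsets here and is meant for small illustrative networks only.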

Nevertheless, $LP(D)$ is still useful for the analysis of the ST-game. Indeed, we show that a sufficient condition for nonemptiness of the core of the ST-game is that the linear programming relaxation $LP(D)$ of $IP(D)$ has an integral optimal solution.

THEOREM 3.1 If the incidence vector of a minimum cost DST in $G$, rooted away from $0$ and whose vertex set contains $D$, is an optimal solution to $LP(D)$, then the core of the associated ST-game is not empty. □

Next we show that the sufficient condition given in T.3.1 is, in general, not necessary for the nonemptiness of the core.

Indeed, consider the network $G_1 = (N_1 \cup \{0\}, E_1)$ in Fig. 3.2. Assume that all edge weights in $G_1$ are 1 and let $D_1 = \{5,6,7\}$ be the set of users. An optimal solution to $IP(D_1)$ is indicated by bold arcs and has total weight $c(D_1) = 6$. It is easy to check that the vector $x = (2,2,2)$ is in the core $C(D_1, c_1)$ of the associated ST-game. On

the other hand, one can verify that $x^*$ defined as follows: $x^*_{(0,1)} = 1$, $x^*_{(1,2)} = x^*_{(1,3)} = x^*_{(1,4)} = x^*_{(2,5)} = x^*_{(2,6)} = x^*_{(3,6)} = x^*_{(3,7)} = x^*_{(4,5)} = x^*_{(4,7)} = \frac{1}{2}$, and $x^*_e = 0$ otherwise, is feasible to $LP(D_1)$ associated with $G_1$ and has the objective function value of 5.5.


4  The  Core  Algorithm 

Whenever we can find the optimal minimum cost ST, we can efficiently test whether the sufficient condition for the nonemptiness of the core of the ST-game given by T.3.1 is satisfied. We construct an algorithm which is a modification of the T-guided heuristic of Prodon et al. (1985) for finding a minimum cost DST. We prove that our algorithm terminates with a dual feasible solution whose objective function value is equal to the weight of $T$ if and only if the incidence vector of $T$ is an optimal solution to $LP(D)$. In addition, if the algorithm delivers an optimal dual solution, it also immediately generates point(s) in the core. The algorithm uses only a quadratic number of simple constraints and overall takes 0(n^) time.

Let $T = (N_T \cup \{0\}, E_T)$ be an optimal DST rooted in $0$. Number the nodes in $T$, assigning zero to root $0$ and $\{1, \ldots, |N_T|\}$ to the nodes in $N_T$, in such a way that for every $k \in N_T$ all successors of $k$ have numbers greater than $k$. For $k \in N_T$, let $T_k = (N_k \cup \{p(k)\}, E_k)$ be the subarborescence of $T$ rooted in the unique immediate predecessor $p(k)$ of $k$. The following algorithm often produces a point in the core $C(D,c)$.

THE CORE ALGORITHM

For $k = |N_T|, \ldots, 1$, scan $k$ as follows:
begin
    while $w(p(k),k) > 0$ and there exists $S \in A$ such that
          $S \cap (N_T \cup \{0\} \setminus N_k) = \emptyset$ and $w(e) > 0$ for all $e \in \delta(S)$ do
        find a minimal such set $S$,
        $y_S = \min\{ w(e) : e \in \delta(S) \}$,
        $w(e) = w(e) - y_S$ for all $e \in \delta(S)$,
        pick an arbitrary node $i \in D \cap S$ and update $x_i = x_i + y_S$,
    end
end

THEOREM 4.1 Let $x \in R^{|D|}$ be the vector obtained by the core algorithm. Then for all $S \subseteq D$, $x(S) \le c(S)$. □

Let $\lambda$ represent the maximum fraction of the total cost that can be distributed while satisfying the core constraints. That is:

$\lambda = \max \{ x(D) : x(S) \le c(S) \text{ for all } S \subseteq D \} / c(D)$.

(Bounds on $\lambda$ for some special cases of the underlying network $G$ are presented in Sharkey (1992).)
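
As an illustration, the fraction defined above (written $\lambda$ here) can be worked out for the network of Fig. 3.1, using only the core constraints listed in Section 3: each single community costs 2, each pair costs 3, and $c(D) = 5$. This is a worked sketch for that one example, not a general method.

```latex
\begin{align*}
2\,x(D) &= (x_1 + x_2) + (x_1 + x_3) + (x_2 + x_3) \le 3 + 3 + 3 = 9
  \quad\Longrightarrow\quad x(D) \le 4.5, \\
x &= (1.5,\ 1.5,\ 1.5) \ \text{satisfies}\ x_i \le 2,\quad x_i + x_j \le 3,\quad x(D) = 4.5 \le 5, \\
\lambda &= \frac{4.5}{5} = 0.9 .
\end{align*}
```

The upper bound from the pair constraints is attained by the symmetric allocation, so at most 90 percent of the total cost can be distributed in that example without giving some coalition an incentive to secede.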

COROLLARY 4.2 Let $x \in R^{|D|}$ be the vector obtained by the core algorithm. Then $x(D)/c(D)$ is a lower bound on the value of $\lambda$. □

THEOREM 4.3 The vector $x \in R^{|D|}$ obtained by the core algorithm is in $C(D,c)$ if and only if the incidence vector of the optimal DST is an optimal solution to $LP(D)$. □

5 Summary

In this paper we provided a sufficient condition for the nonemptiness of the core of an ST-game. We also developed an efficient algorithm that gives a lower bound on the maximal total cost that can be distributed while satisfying the core constraints. We prove that this algorithm will generate a point in the core if and only if the optimal objective function values of the associated $IP(D)$ and $LP(D)$ are equal. Computational experiments with DST on general networks performed by Chopra et al. (1992) and Prodon et al. (1985) indirectly confirm that our algorithm will often produce core points or will offer a good approximation of a core point.

Abstract

In this paper we examine a class of nonlinear, stochastic knapsack problems which occur in manufacturing, facility or other network design applications. Series, merge and split topologies of series-parallel M/M/1/K and M/M/C/K queueing networks with an overall buffer constraint bound are examined. Bounds on the objective function are proposed and a sensitivity analysis is utilized to quantify the effects of buffer variations on network performance measures.

Keywords: Buffer Allocation, Stochastic, Nonlinear Knapsack

1 INTRODUCTION

Stochastic networks of service centers with variable service rates and finite waiting capacities (buffers) occur in many network design applications such as manufacturing facilities, communication networks and vehicular traffic systems. One of the most challenging tasks of the network designer is to allocate buffers at each service center while keeping in mind the total capacity of the network.

2 CONSTRAINED NETWORK DESIGN (CND) PROBLEM

This section discusses the research problem, the inherent complexities of the problem, and various optimization techniques, then proposes a methodology to solve the problem.

The analysis of a queueing network is highly dependent on the topology of the nodes and arcs of the network. We can assume a graph $G(V, A, \Gamma)$ where

$V$ = a finite set of nodes (vertices) which represent machine centers, $i = \{1, 2, \ldots, N\}$

$A$ = a finite set of arcs which may represent the material handling transfer systems.

$\Gamma$ = an incidence function which regulates the flow or routing of entities within $G$.

*Department of Industrial Engineering and Operations Research, University of Massachusetts, Amherst, Massachusetts 01003

Any queueing network optimization problem can be decomposed into a set of three inter-related optimization problems [24].

1. Topological Network Design Problem (TND): The topology of the network can be identified as a combination of one or all of the three classes of network topologies, viz. i) series, ii) splitting, and iii) merging. This problem deals with finding the best topology of the nodes and the arcs.

2. Routing Network Design Problem (RND): This deals with routing of the flow of entities along the arcs in the given topology. The research problem studied in this paper assumes that the topology and routing of entities in the network have already been identified.

3. Constrained Network Design Problem (CND): The general research problem is concerned with the allocation of resources, such as the number of servers and the buffers at each of the servers, assuming that the TND and RND problems have been solved. This paper deals only with the allocation of buffers at each node in the network, given the constraint that the sum of the number of buffers for the entire network does not exceed a maximum limit set for the network. An extension to the simultaneous problems of routing and buffer allocation is also discussed.

2.1 Assumptions

The assumptions for the analysis are as follows:

1. The graph $G(V, A, \Gamma)$ has been identified, i.e., the type of topology, the number of nodes, the arcs connecting the nodes and the routing of the entities within $G$ are given.

2. There are $N$ nodes in the layout.

3. Once the service on a product is completed at the $i$th node, it is routed to the $j$th node with routing probability $r_{ij}$. If there is no room at the node, the item is blocked.

4. Arrivals to the network are Poisson with mean arrival rate $\Lambda$.

5. Service times at the $i$th node are exponentially distributed with mean rate $\mu_i$.

6. The first node in the network is never starved and the final node in the network is never blocked.


2.2 Mathematical Model

The CND problem has the following basic objective function for general network topologies:

Maximize $Z = \theta(P - V) - HL$

s.t. $\sum_{i=1}^{N} X_i \le B$

$X_i \ge 0$ and integer

where:

$\theta$ = average throughput of the topology

$P$ = average revenue/item

$V$ = average variable production cost

$H$ = average holding cost/item

$L_i$ = number of units at node $i$ at steady state

$L$ = average total number of units in the production line at steady state

$B$ = total capacity allocated to the network

$X_i$ = capacity (buffer) allocated at each node

$C_i$ = cost of each buffer allocated to node $i$ in the network

$C = C_i$, $i = 1, \ldots, N$, cost of each buffer in the network, all $C_i$ being equal

$\mu_i$ = service rate of node $i$

$\lambda_i$ = arrival rate to queue $i$

The CND problem mentioned above is a nonlinear stochastic knapsack problem. One of the features of the CND problem which makes it very challenging to solve is that no known closed-form expression for estimating the throughputs in arbitrarily configured finite open queueing networks exists. This feature makes it very hard to control the design variables as a function of the variation in the objective function.
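
For a single isolated M/M/1/K node, in contrast to a general network, the steady-state distribution is known in closed form, so the objective can be evaluated directly. The sketch below illustrates the terms of $Z$ for that special case only; it is not the network evaluation procedure of this paper. The function names and the numerical values for revenue, variable cost and holding cost are hypothetical, and K is taken to be the total capacity of the node including the item in service.

```python
def mm1k_metrics(lam, mu, k):
    """Throughput and mean number in system of an M/M/1/K queue.

    lam: arrival rate, mu: service rate, k: system capacity
    (the item in service is counted in k).
    """
    rho = lam / mu
    weights = [rho ** n for n in range(k + 1)]  # unnormalised p_n = rho^n
    total = sum(weights)
    probs = [w / total for w in weights]
    throughput = lam * (1.0 - probs[k])  # arrivals that find room
    mean_units = sum(n * p for n, p in enumerate(probs))
    return throughput, mean_units


def objective(lam, mu, k, revenue, var_cost, holding):
    """Z = theta * (P - V) - H * L for a single node."""
    theta, units = mm1k_metrics(lam, mu, k)
    return theta * (revenue - var_cost) - holding * units


# Hypothetical numbers: compare the objective for several capacities.
values = {k: objective(1.0, 2.0, k, revenue=10.0, var_cost=4.0, holding=2.0)
          for k in (1, 2, 3)}
theta_2, units_2 = mm1k_metrics(1.0, 2.0, 2)
```

Normalising the weights $\rho^n$ by their sum avoids a separate formula for the case $\rho = 1$. Adding capacity raises throughput but also the average number of units held, which is the trade-off that the objective captures.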


2.3 Proposed Methodology

Since there is no closed-form expression for estimating the objective function of a finite queueing network, the only way of finding the sub-optimal buffer allocation is to employ an iterative method, which estimates the objective function for a set of buffers at each iteration, finds the direction of optimality and changes the buffers at the nodes in such a manner that the objective function is increased until a set of convergence standards is satisfied.

One of the keys to our study here is to develop performance bounds on the objective function value so that when the optimal search procedure is carried out, we can have a robust and stable technique for searching for the optimal values of the design variables.

We will first present the bound for the methodology for M/M/1/K queues, where the customer is lost if blocking occurs in the topology, then the bound for a delay system and, finally, the bound for M/M/C/K queues.

3 DESIGN METHODOLOGY

3.1 Introduction

The search method employed for arriving at the sub-optimal decision variables is the Complex Method of BOX [4], where the independent variables are the buffer values at the nodes and the objective function is the earlier mentioned objective function of the CND.

The BOX method is a derivative-free sequential search technique which conducts an iterative search for the optimum value of an objective function while there are linear or non-linear constraints on the values of the independent variables.

3.2 Starting Solution

One of the features of the BOX method is that the user supplies a starting feasible point (buffer values) and this starting solution sets the search pattern for the rest of the iterations. Since the pattern of allocation of buffers in the sub-optimal solution is highly dependent on the arrival rate, service rates and utilization at the nodes and the total capacity allocated for the network, it is very important that the starting point supplied to the search method have a pattern of allocation of buffers similar to that in the final sub-optimal solution.

It is very difficult for the user to supply the right starting point since not much is known about the dynamic behavior of the queues, and no simple knapsack heuristic starting solution was found to be appropriate. At this stage, the authors decided to use the procedure proposed in [9] to arrive at a refined final starting point for the constrained search method. This procedure, using the previous bounding methods to estimate the throughput and the objective function of the network and Powell's procedure [21] for solving unconstrained nonlinear programming problems, calculates the allocation of buffers at the nodes based on the arrival rate, the service rates of the nodes, the routing of the entities and the assumption that there is no constraint on the total capacity of the network.


3.3 General Design Methodology

The CND problem is solved in four stages. A flowchart representation is given in Figure 1.

Stage 1.0 (Initialization)

1.1 Identify the graph $G(V, A, \Gamma)$, the arrival rates $\Lambda_j$, $j = 1, \ldots, M$ (where $M$ = number of queues in a merging topology), the service rates $\mu_i$, $i = 1, \ldots, N$, the routing probabilities $\alpha_j$, $j = 1, \ldots, K$ ($K$ = number of queues in a splitting topology) and the total capacity $B$ in the network.

1.2  Select  the  valves  0/7,  (,  beta  and  S. 

1.3 Set the limit on the number of iterations for the search method.

Stage 2.0 (Iterations for the search pattern for the BOX method)

2.1  Select  a  etarting  point  X}  t  =  each 

that 

X}  >  0  and  ^  procedure  in 

step  i.I. 

2.2 Obtain $X_i'$, $i = 1, \ldots, N$, the sub-optimal buffer allocation at the nodes, using the procedure in [9] and assuming infinite capacity for the network.

2.3 Obtain proportionate values $X_i$, $i = 1, \ldots, N$, at the nodes such that they meet the capacity constraint set for the network.

Let

$X_i'$ ($i = 1, \ldots, N$) = buffers allocated to the nodes in the optimal solution in step 2.2

$B$ = total capacity for the network

$X_i$, $i = 1, \ldots, N$ = starting solution for the BOX search method

then

$X_i = X_i' \cdot (B/(M - 3))$ if $N \le 3$
and $X_i = X_i' \cdot (B/(M - N))$ if $N > 3$

Stage 3.0 (Obtain a continuous sub-optimal solution)

Run the BOX search method to obtain the solution $X_i^*$, $i = 1, \ldots, N$.

Stage 4.0 (Obtain an integer sub-optimal solution) The sub-optimal solution obtained in stage 3 will be, but for an exception, a non-integer solution.

4.1 To arrive at an integer solution, form $2^N$ combinations using the two integers closest to the value of the buffers for each node.

Let $X_i^{(l)}$ be the largest integer less than $X_i^*$ and $X_i^{(u)}$ be the lowest integer greater than $X_i^*$.

So we have $2^N$ combinations of integer solutions.

4.2 Of the above $2^N$ combinations, only those whose sum is less than or equal to $B$ are selected. Thus there are $N(N+1)/2$ combinations. Obtain the objective function for each of the combinations. The objective function for each combination is calculated using the expansion method.

4.3 The combination which returns the maximum value of the objective function is the sub-optimal integer solution.
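
The rounding stage can be sketched as an enumeration over the floor and ceiling of each continuous buffer value. In the sketch below the objective is passed in as a function; the quadratic stand-in used in the usage lines is purely hypothetical and takes the place of the expansion method, which is not reproduced here. The function name and the numbers are assumptions for illustration.

```python
from itertools import product
from math import ceil, floor


def best_integer_allocation(x_cont, capacity, objective):
    """Pick the best feasible combination of rounded buffer values.

    x_cont: continuous buffer values, one per node
    capacity: total buffer capacity B
    objective: callable mapping an integer allocation (tuple) to a value
    """
    # Two candidate integers per node (one if the value is already integer).
    candidates = [sorted({floor(x), ceil(x)}) for x in x_cont]
    best, best_val = None, None
    for combo in product(*candidates):
        if sum(combo) > capacity:
            continue  # violates the total capacity constraint
        val = objective(combo)
        if best_val is None or val > best_val:
            best, best_val = combo, val
    return best, best_val


# Hypothetical continuous solution and stand-in objective.
x_star = [2.4, 3.7, 1.2]


def stand_in(alloc):
    return -sum((b - x) ** 2 for b, x in zip(alloc, x_star))


best, value = best_integer_allocation(x_star, 8, stand_in)
```

The enumeration grows as $2^N$, so it is practical only for networks with a modest number of nodes. If no rounded combination fits within the capacity, the function returns `(None, None)`.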

The authors conducted approximately 200 experiments with various types of topologies and their combinations and found the above described methodology to work successfully each and every time. The results of experiments conducted for series, splitting and merging topologies are described and validated in the next section.

4 SUMMARY OF RESULTS

In this paper, a methodology was proposed for finding the optimal buffer allocation in a constrained network design problem. The results in the preceding sections show that this methodology is very effective in providing local optimal or sub-optimal buffer allocations to the nodes for series, merge and split topologies. A sensitivity analysis was also conducted to provide insights regarding the behavior of constrained queueing networks.

Extended Abstract

Interior point methods, which follow the central path more or less closely, have been
successfully applied to the solution of large linear programs.

Generalizing the notion of the analytic center of a finite system of linear (convex, analytic)
inequalities, which proved to be of central importance for the theory of interior point methods
in linear (convex) programming, we define an analytic center for convex sets in $R^n$ defined
as feasible sets corresponding to a smooth, p parameter family of convex (e.g. quadratic
or linear) inequalities, $1 \le p \le n - 1$. Connections to the theory of (central solutions of) the
classical moment and related operator extension problems, as well as to relevant notions of
affine differential and integral geometry, are briefly discussed. One of the most reassuring facts
is that the "maximum entropy" solutions in the classical moment problems, see [1], can be
interpreted as a special application of the general principle used by us to define a path of nice
feasible solutions leading to the set of the optimal ones, see below.

For the solution of semi-infinite linear programs (which arise when the finite index set is
replaced with a continuum A, on which one is interested in solving, say, an optimal approximation
problem or a moment problem) the commonly used methods were based on (adaptive)
discretization of A and solution of the arising finite but large linear programs. In this way the
smoothness and analyticity of the data functions (on A) have not been exploited at all,
and the dimension grows drastically when accuracy requirements are increased. In [1], [2], [3]
we outlined methods using analytic centers and the central path for solving semi-infinite convex
programs.

In order to explain our approach we recall that an optimization problem is easily reduced
to a one parameter family of feasibility problems. Therefore we think that a basic problem of
numerical convex analysis is to find a nice solution concept for (important classes of) feasibility
problems with feasible sets, say, of the following type:

$$ P(f^A, e^B) := \{ x = L\xi \mid f(\alpha, \xi, D_1\xi, p) \ge 0,\ \alpha \in A,\ e(\beta, \xi, D_2\xi, p) = 0,\ \beta \in B \}, \tag{1.1} $$

where $L$ is a linear operator, $\xi$ is the state of the underlying system, $D_1$ and $D_2$ are linear,
constant differential operators (applicable to the functions $\xi(\cdot)$), $f(\cdot)$ being concave quadratic
in $(\xi, D_1\xi)$, $e(\cdot)$ being linear in $(\xi, D_2\xi)$, $p$ is a parameter, and A and B are "index sets", the
elements of which often have the interpretation of points in space and (or) time. "Nice"
means that this solution must be a low complexity function of the "data", i.e. of the parameter
p defining the system (of inequalities and equalities), which can easily be updated when this
system, i.e. its parameters, are changed (usually by a one parameter homotopy). Section 2 is
devoted to such a concept of a nice central solution.

The second point here is that reasonably (i.e. not too) complex feasibility problems are
those in which the elementary inequalities and equalities are simple, i.e. the former are given by the
positivity of a linear or quadratic function of the unknowns, while the equalities are linear
in x; moreover the dependence of an individual inequality (equality) on its defining parameter
is also "simple". This will be qualified further below, e.g. in the case $D_1 = D_2 = 0$ mainly
by requiring that certain integrals of the arising, algebraically simple functions over the given
sets A, B can be easily computed by simple quadrature (i.e. "cubature", ...) formulae (within
appropriate accuracy). By requiring (in fact often: "exploiting") this, we are able to avoid
the blowing up of the dimension of the linear or "quadratic" programming problems arising by
brute force discretization of the parameter space A. Similarly we have to assume that the
"structure" of the equality constraints imposed on the unknowns is also "simple". We shall see
that in this approach the basic problems belong to the realm of classical analysis, with algebraically
simple analytic functions and their integrals and approximations (say by rational functions,
or by other simple, constructive classes of rather smooth functions) playing an important
role: the effectiveness of the proposed method depends on how quickly we are able to follow,
i.e. continue by extrapolating (i.e. predicting), the homotopy path of "nice" interior solutions
leading to an optimal solution.

It turns out that the latter problem is closely connected to another problem: how to find
"nice", relatively tight two-sided ellipsoidal approximations (around the previously defined
"nice" central solutions) for the corresponding feasible sets. In fact, nicety of these centers
should be defined so as to include the existence of low complexity algorithms for constructing
and updating these ellipsoids. There are several reasons for imposing these requirements. First
of all, the existence of such approximations turns out to be responsible for the effectiveness of
the corrector phase (via Newton's method) of the (homotopy) path-following predictor-corrector
method, see [5].

2.  Basic  properties  of  analytic  centers 

The (analytic) center $x(f^A, e^B)$ of the convex inequality system (1.1), with a bounded
feasible set having a nonempty interior in $R^n$, is defined as the (in general) unique
solution of the supremum problem

$$ \sup_x \Phi(x), \quad \Phi(x) = \sup\Big\{ \int_A \log f(\alpha, \xi, D_1\xi, p)\, d\alpha \ \Big|\ e(\beta, \xi, D_2\xi, p) = 0,\ x = L\xi,\ \forall \beta \in B \Big\}, \tag{2.1} $$

where $d\alpha$ is a measure which is independent of $x$ but may depend on the set $\{f^A\}$; we
assume here, just for simplicity, that $d\alpha$ depends only on A. Notice that (2.1) is a
classical Euler-Lagrange type variational problem, which in general has a unique solution
depending analytically on the parameter p. In this section we discard the dependence of $f$
on the parameter p and first restrict attention to the case where B is the empty set. In
general, e.g. if A is a finite set and all $f(\alpha, \cdot)$ are linear, or if all $f(\alpha, \cdot)$ are concave and
(at most) quadratic, at least one being negative definite, the function

$$ \Phi(x) = \int_A \log f(\alpha, x)\, d\alpha \tag{2.2} $$

is strongly concave over $P^A$. To assure the existence of the integrals in (2.1) and of those
appearing later it would be enough to assume that $f(\cdot, x)$ and its derivatives (up to
order two) are continuous and uniformly (in $x \in P^A$) bounded over A. More important is that
in the proposed methodology (using analytic homotopies along centers) we need (or at least
would like to have) a high degree of smoothness and algebraic simplicity; therefore a nonsmooth
constraint of the type, say,

'n«wff<(v)  <  1,  yi(y)  :=  | /•i(v)  1.*  =  1, •••.»» 


(2.3) 


will be replaced by a set of smooth constraints


-%•</<>(»)<  t  = 

j 

i.e. we set $x = (\eta_1, \dots, \eta_k, y)$. For simplicity we shall assume first that in (2.1) $f(\cdot) =
f(\alpha, x)$, $e = 0$, $L = \mathrm{id}$, $D_1 = 0$. The following invariance properties of this "solution"
concept are important:

(1) affine invariance: if we replace $f(\alpha, \cdot)$ by $\tilde f(\alpha, \cdot)$, $\tilde f(\alpha, y) := f(\alpha, Ty + t)$, where $t \in
R^n$, $T : R^n \to R^n$, $\det T \ne 0$, then

$$ t + T x(\tilde f^A) = x(f^A); \tag{2.4} $$

(2) invariance under scaling: if we replace $f(\alpha, \cdot)$ by $\tilde f(\alpha, \cdot) := k(\alpha) f(\alpha, \cdot)$, for an arbitrary
(positive) function $k(\alpha)$, then

$$ x(\tilde f^A) = x(f^A). \tag{2.5} $$

The proof of (1) is easily obtained from the characterization of the optimal solutions of
(2.1):

$$ \int_A \frac{\nabla f(\alpha, x)}{f(\alpha, x)}\, d\alpha = 0. \tag{2.6} $$

The proof of (2) is immediate from the "additivity" of the log function.
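
For a finite index set and linear $f(i, x) = b_i - a_i^T x$, condition (2.6) can be solved by a damped Newton iteration, and the invariance properties (2.4) and (2.5) can then be checked numerically. The following Python sketch is an illustration only: the function name `analytic_center`, the triangle example and the particular step-size rule (a standard damping based on the Newton decrement) are assumptions, not taken from the abstract.

```python
import numpy as np


def analytic_center(A, b, x0, tol=1e-10, max_iter=200):
    """Maximize sum_i log(b_i - a_i^T x), starting from a strictly feasible x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = b - A @ x                        # slacks f(i, x) > 0
        g = -A.T @ (1.0 / s)                 # gradient of Phi, zero at the center (2.6)
        H = (A.T * (1.0 / s**2)) @ A         # minus the Hessian of Phi
        dx = np.linalg.solve(H, g)           # Newton direction
        lam = np.sqrt(max(g @ dx, 0.0))      # Newton decrement
        if lam < tol:
            break
        # Full step close to the center, damped step otherwise (stays feasible).
        x = x + (dx if lam < 0.25 else dx / (1.0 + lam))
    return x


# Triangle x >= 0, y >= 0, x + y <= 1 written as A x <= b.
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])
x0 = np.array([0.25, 0.25])
center = analytic_center(A, b, x0)

# (2.4): replace f(i, .) by f(i, T y + t); the center transforms accordingly.
T = np.array([[2.0, 1.0], [0.0, 1.0]])
t = np.array([1.0, -1.0])
y_center = analytic_center(A @ T, b - A @ t, np.linalg.solve(T, x0 - t))

# (2.5): multiply each inequality by a positive factor k_i; the center is unchanged.
k = np.array([2.0, 0.5, 3.0])
scaled_center = analytic_center(k[:, None] * A, k * b, x0)

print(center, t + T @ y_center, scaled_center)
```

All three printed points should coincide up to the stopping tolerance, which illustrates that the center depends on the feasible set and not on its particular description.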

A further, rather useful property of this solution concept is that it often allows one to find
good ellipsoidal inner or (and) outer approximations for the set $P^A$. The idea is simple:
consider the function

$$ \Psi^A(x) := \exp(\Phi^A(x)) \tag{2.7} $$

and the set (where $H = D^2\Psi^A(x(f^A))$ is the Hessian of $\Psi^A$ at $x(f^A)$)

$$ E_{in} := \{ z \mid -\tfrac{1}{2} z^T H z \le \Psi^A(x(f^A)) \}. \tag{2.8} $$

It is natural to expect that $E = E(f^A)$ is a good ellipsoidal approximation of $P^A$ (which,
of course, shares the invariance properties (1) and (2) of $x(f^A)$).

Theorem  1 

Suppose that A is a finite set of cardinality m and that the functions $f(\alpha, \cdot)$, $\alpha \in A$, are
concave and quadratic, then

x(/'‘)  +  T^E{f*)  <P*<  x{f*)  +  ^/^E{f*)  (2.9) 


This is proved in [14]. The surprising point here is the independence of the quality of this
approximation of the specific data, i.e. of the form of the functions $f(\alpha, \cdot)$, $\alpha \in A$.

An important property of the function $\Psi$ is that it is concave on $P^A$ if $\int_A d\alpha = 1$,
a condition (normalization) which we shall assume below (obviously without loss of generality).
This can be proved by computing the Hessian of $\Psi$,

$$ \frac{D^2\Psi(x)}{\Psi(x)} = \int_A \frac{D^2 f(\alpha, x)}{f(\alpha, x)}\, d\alpha - \int_A a(\alpha) a(\alpha)^T d\alpha + \Big(\int_A a(\alpha)\, d\alpha\Big)\Big(\int_A a(\alpha)\, d\alpha\Big)^T, $$

where $a(\alpha) = \nabla f(\alpha, x)/f(\alpha, x)$, and using the inequality

$$ \Big(\int_A d\alpha\Big) \int_A a(\alpha) a(\alpha)^T d\alpha \ \ge\ \int_A a(\alpha)\, d\alpha \Big(\int_A a(\alpha)\, d\alpha\Big)^T, $$

which is obtained from the Cauchy-Schwarz inequality (multiplying, from both sides, with
a vector).

Many convex feasibility problems are written in "dual" form, known as a finite, or restricted,
moment problem: given $c^T = \{ c(t) \mid t \in T \}$, the "set of feasible solutions" is formed
by the nonnegative mass distributions (densities) $\mu$ over a set S,

$$ P^K(c^T) = \Big\{ \mu \ \Big|\ c(t) = \int_S K(t, s)\, d\mu(s),\ t \in T,\ d\mu \ge 0 \Big\}. \tag{2.10} $$

The central feasible solution is defined as that element (if it exists and is unique) which solves
the supremum problem

$$ \sup\Big\{ \int_S \log \mu'(s)\, ds \ \Big|\ \mu \in \mathrm{int}\, P^K(c(\cdot)) \Big\}; \tag{2.11} $$

here int stands (intuitively) for "interior of" and means (precisely) that only those mass
distributions are regarded which have a log-integrable density.

If A is a finite set, $A = \{1, \dots, m\}$, and the $f(\alpha, \cdot)$ are linear in x, say,

$$ f(\alpha, x) = b_\alpha - a_\alpha^T x, \quad \alpha = 1, \dots, m, $$

then introducing

$$ \mu_i := b_i - a_i^T x, \quad i = 1, \dots, m, $$

we can find vectors $k_1, \dots, k_{m-n} \in R^m$ and scalars $c_1, \dots, c_{m-n}$ such that

$$ P^K(c^T) = \{ \mu \mid \langle k_j, \mu \rangle = c_j,\ j = 1, \dots, m - n,\ \mu \in R^m_+ \} \tag{2.12} $$

is identical with the set of vectors $\{ f^A(x) \mid x \in P(f^A) \}$ (note that $|T| = N = m - n$).
Obviously $f^A(x(f^A))$ then yields the solution of problem (2.11).

A further strong motivation for the importance of the solution concept (2.11), and thus for
(2.1), is that for some rather interesting special cases (of the kernel function K), e.g. in
the class of the Nevanlinna-Pick type moment problems, the solution of (2.11), known as the
maximum entropy solution, can be computed exactly(!) in a very simple way in $O(N^2)$ (in
fact even in $O(N \log N)$) operations and recursively in N. In fact this classical example is at
the root of recent more general results about the role (existence, applications) of central or
"maximum entropy" solutions to the basic optimization problems, see [1], [6] for further
references.

An important application of centers is for solving optimization problems of the type

$$ \inf\{ f_0(x) \mid x \in P^A \} =: f_0^* = ? \tag{2.13} $$


Using the observation that $f_0(x(\lambda)) \searrow f_0^*$ for $\lambda \searrow f_0^*$, where $x(\lambda)$ is the center of the extended
system

$$ \sup\Big\{ \log(\lambda - f_0(x)) + \int_A \log f(\alpha, x)\, d\alpha \ \Big|\ x \in P^A,\ \lambda > f_0(x) \Big\}, \tag{2.14} $$

the idea is to follow the homotopy path $x(\cdot)$ from $\lambda_0$ to $f_0^*$ by a predictor-corrector method,
where in the corrector steps Newton's method is applied to the system

$$ -\frac{\nabla f_0(x)}{r} + \int_A \frac{\nabla f(\alpha, x)}{f(\alpha, x)}\, d\alpha = 0, \quad r > 0. \tag{2.15} $$

Here we introduced a new parametrization of the central path by $r = \lambda - f_0(x(\lambda))$ such that
$r = 0$ corresponds to the optimum.
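
For a linear objective $f_0(x) = c^T x$ and finitely many linear constraints $b_i - a_i^T x \ge 0$, system (2.15) reads $-c/r - \sum_i a_i/(b_i - a_i^T x) = 0$. A minimal path-following sketch in Python is given below. It assumes the simplest possible predictor (the previous center is reused while r is halved) and a damped Newton corrector; the names `newton_corrector` and `follow_central_path`, the reduction factor and the triangle example are illustrative assumptions, not the implementation described in this abstract.

```python
import numpy as np


def newton_corrector(c, A, b, x, r, tol=1e-9, max_iter=200):
    """Newton's method for (2.15) with f_0 = c^T x and f(i, x) = b_i - a_i^T x."""
    for _ in range(max_iter):
        s = b - A @ x
        g = -c / r - A.T @ (1.0 / s)         # left-hand side of (2.15)
        H = (A.T * (1.0 / s**2)) @ A         # minus the Jacobian of g
        dx = np.linalg.solve(H, g)           # Newton direction
        lam = np.sqrt(max(g @ dx, 0.0))      # Newton decrement
        if lam < tol:
            break
        # Damped step when far from the path keeps the iterate strictly feasible.
        x = x + (dx if lam < 0.25 else dx / (1.0 + lam))
    return x


def follow_central_path(c, A, b, x0, r0=1.0, shrink=0.5, r_min=1e-6):
    """Trace x(r) from r0 down to r_min; r -> 0 corresponds to the optimum."""
    x, r = np.asarray(x0, dtype=float), r0
    x = newton_corrector(c, A, b, x, r)
    while r > r_min:
        r = max(shrink * r, r_min)           # zero-order predictor: keep x, reduce r
        x = newton_corrector(c, A, b, x, r)
    return x


# Minimize -x - 2y over the triangle x >= 0, y >= 0, x + y <= 1.
c = np.array([-1.0, -2.0])
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])
x_opt = follow_central_path(c, A, b, x0=[0.25, 0.25])
print(x_opt, c @ x_opt)
```

A higher-order predictor would extrapolate $x(r)$ using derivatives of the path; the zero-order variant above is chosen only to keep the sketch short.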

3. Definition of an analytic center for convex sets defined as the intersections
of a p parameter family of halfspaces

Let $K$ be a compact, convex set in $R^n$; its center of gravity $\mathrm{cgr}(K)$ is the solution
$x = x(K)$ of the following optimization problem:

$$ \sup_x \Phi(x), \quad \Phi(x) := \sup\Big\{ \int_K \log \mu(z)\, dz \ \Big|\ x = \int_K z \mu(z)\, dz,\ \int_K \mu(z)\, dz = 1,\ \mu \ge 0 \Big\}, \tag{3.1} $$

where, for simplicity, it is assumed that the sup is extended over all mass distributions
which have a log-integrable density. In fact the solution of the inner optimization problem
is easily obtained (notice that (3.1) is a moment problem like (2.10)-(2.11)): for given x the
optimal density has the form $\mu(z) = (\sum_{i=1}^n a_i z_i + a_0)^{-1}$, where the Lagrange multipliers
$a_i = a_i(x)$, $i = 1, \dots, n$, are uniquely determined through the optimality (and positivity)
conditions (see below), by the strong convexity of the problem:

$$ \Phi(x) = -\int_K \log(a^T(x)\bar z)\, dz, \qquad \bar x = \int_K \bar z\, (a^T(x)\bar z)^{-1} dz, \tag{3.2} $$

where we used the notations $z \mapsto \bar z = (1, z_1, \dots, z_n)$ (and likewise $\bar x = (1, x)$), $a^T \bar z = \sum_{i=1}^n a_i z_i + a_0$. Without loss
of generality and following our earlier normalization we can assume that $\mathrm{vol}(K) = 1$ and
$\mathrm{cgr}(K) = 0$; this implies that $a_i = 0$, $i = 1, \dots, n$, $a_0 = 1$ and $\exp \Phi(0) = 1$.

Pursuing the analogy with the construction (2.7)-(2.8) we are led to consider the ellipsoid

$$ E := \{ z \mid -\tfrac{1}{2} \langle D^2\Phi(0) z, z \rangle \le 1 \}. \tag{3.3} $$

This ellipsoid is the complete analogue of (2.8); there the total mass is m (the number of
summands in the analogue of the potential function (2.2)).

Theorem 2. For an arbitrary convex, compact domain K in $R^n$ the above ellipsoid,
centered at the origin, the center of gravity of K, provides an (optimal order) two-sided
approximation of K:

$$ E \subset K \subset \mathrm{const} \cdot n E. \tag{3.4} $$

Proof. See [3], where the case of a one parameter family of linear inequalities is also dealt
with in more detail.

Now we look at the dual of the above approximation (construction).

Definition. Let P be a convex compact set; its analytic center $c(P)$ is defined as that
(uniquely determined) point which becomes the center of gravity of the dual set $P^*$ when it
is taken as the center of duality:

$$ c(P) = \mathrm{cgr}(P^*_{c(P)}). \tag{3.5} $$

(Here and below we indicate, by a lower index, the point which is taken as the center of
duality, unless it is the origin.)

Lemma. The above point $c(P)$ is uniquely determined, since it is the solution of a
(strongly) convex minimization problem: it minimizes the volume of the dual set

$$ \mathrm{vol}(P^*_x) = \mathrm{const} \int_{\|\phi\| = 1} (m(\phi, P) - \phi^T x)^{-n}\, d\phi. \tag{3.6} $$


Proof. Noting that $P^*_x = \{ \phi \mid m(\phi, P) - \phi^T x \le 1 \}$, where $m(\phi, P) = \sup\{ \phi^T x \mid x \in P \}$, the
validity of the second formula being immediate, differentiating with respect to x we get the
condition

$$ 0 = \int_{\|\phi\| = 1} \frac{\phi}{(m(\phi, P) - \phi^T x)^{n+1}}\, d\phi = \mathrm{const} \int_{P^*_x} \phi\, d\phi, \tag{3.7} $$

i.e. that x is the center of gravity of $P^*_x$. The uniqueness of the fixed point of (3.5) follows from
the strong convexity of the function $\int (m(\phi, P) - \phi^T x)^{-n} d\phi$ in x over P; note that instead
of the Lebesgue measure we could take here any measure $d\phi$ which has at least n positive
weights in linearly independent "points" $\phi$. We were led to the point (3.5) by analogy with
(2.11)-(2.12), having in mind this latter case (of a discrete measure), where the potential
function is proportional to the volume of an ellipsoid containing the dual polyhedron, see the
references to earlier papers of the author in [2] and below. If we describe the set P by its dual
with respect to the point $c(P)$,

$$ P = \{ x \mid m(\phi, P) \ge \phi^T x,\ \forall \phi \in P^*_{c(P)} \}, \tag{3.8} $$

then $c(P)$ yields the maximum of the potential function

$$ \Phi(x) = \int_{P^*_{c(P)}} \log(m(\phi, P) - \phi^T x)\, d\phi; $$


in fact the above lemma shows that $c(P)$ is the unique fixed point of the map $x \mapsto z(x)$,

$$ z(x) = \arg\max_z \Phi_x(z), \qquad \Phi_x(z) = \int_{P^*_x} \log(m(\phi, P) - \phi^T z)\, d\phi; $$

indeed the equations

$$ 0 = \int_{P^*_x} \frac{\phi}{m(\phi, P) - \phi^T x}\, d\phi, \qquad 0 = \int_{P^*_x} \phi\, d\phi $$

are equivalent (by the zero, resp. first order, "homogeneity" of the integrands). Moreover
the second derivative of this "optimal" potential function, which is expected to yield the
ellipsoidal approximation for this inequality system according to the construction (2.7)-(2.8),
is just the matrix M, i.e. the inverse of the similar (but dual) one associated to $c(P)$ as to a
center of gravity. This observation (showing the primal-dual coherence of our constructions)
is a generalization of an analogous connection observed earlier for finite systems of linear or
convex (quadratic) inequalities.

(x)  =  (£>2$^‘'(x)]-‘  for  X  =  c(P)  =  cgr(r^) 

In the latter (finite) case the connection (equality) between the volume of an ellipsoid
containing the dual polyhedron and the value of the potential function $\Phi(x)$ has been
known earlier.

3. On the chosen implementation

Here we present experiences with a class of methods whose distinctive features of
implementation can be summarized as follows:

1.) Path following using the differential equation of the central and parallel paths
(particularly its simplest first order implementation: the affine scaling direction).

2.) Exploiting the analyticity of the elementary bounds and their dependence on the
semi-infinite parameter $\alpha$ by using high order quadrature (and path extrapolation).

3.) Dynamic, adaptive selection of the approximate finite set of nodes to be used for
checking feasibility of the extrapolated points; this is used for selecting the stepsizes.
These nodes are provided by the use of adaptive quadrature algorithms, see e.g. [5], to select
(i.e. concentrate asymptotically) the integration nodes in the subdomain where the constraints
are "active".

4.) Regulation of the step size either by adopting a continuous recentering strategy or
(in the first order case) by advancing with a constant portion of the distance to the boundary
(along the extrapolation line).
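
To illustrate how quadrature replaces brute force discretization, the sketch below approximates $\Phi(x) = \int_A \log f(\alpha, x)\, d\alpha$ by a fixed composite Gauss-Legendre rule and computes the corresponding center by Newton's method, checking feasibility only at the quadrature nodes. It is written in Python rather than MATLAB, and everything specific in it is an assumption made for illustration: the helper names, the panel count and order, and the test problem (a unit disk described by the halfspaces $a(\alpha)^T (x - x_c) \le 1$, $\alpha \in [0, 2\pi]$, whose center with respect to the uniform measure is $x_c$ by symmetry).

```python
import numpy as np


def composite_gauss(lo, hi, panels=16, order=8):
    """Nodes and weights of a fixed composite Gauss-Legendre rule on [lo, hi]."""
    t, w = np.polynomial.legendre.leggauss(order)    # rule on [-1, 1]
    edges = np.linspace(lo, hi, panels + 1)
    half = 0.5 * (edges[1:] - edges[:-1])            # panel half-widths
    mid = 0.5 * (edges[1:] + edges[:-1])             # panel midpoints
    nodes = (mid[:, None] + half[:, None] * t[None, :]).ravel()
    weights = (half[:, None] * w[None, :]).ravel()
    return nodes, weights


def barrier(x, A, b, w):
    """Quadrature approximation of Phi(x) for f(alpha, x) = b(alpha) - a(alpha)^T x."""
    return np.sum(w * np.log(b - A @ x))


def quadrature_center(A, b, w, x0, tol=1e-9, max_iter=100):
    """Newton's method for the maximizer of the quadrature approximation of Phi."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = b - A @ x
        g = -A.T @ (w / s)                           # gradient of Phi
        H = (A.T * (w / s**2)) @ A                   # minus the Hessian of Phi
        dx = np.linalg.solve(H, g)
        if np.sqrt(max(g @ dx, 0.0)) < tol:
            break
        step = 1.0
        # Halve the step until all nodes stay feasible and Phi does not decrease.
        while step > 1e-12 and (np.any(b - A @ (x + step * dx) <= 0)
                                or barrier(x + step * dx, A, b, w) < barrier(x, A, b, w)):
            step *= 0.5
        x = x + step * dx
    return x


alpha, w = composite_gauss(0.0, 2.0 * np.pi)
disk_center = np.array([0.3, -0.2])
A = np.column_stack([np.cos(alpha), np.sin(alpha)])  # rows a(alpha)^T
b = 1.0 + A @ disk_center                            # f = 1 - a^T (x - disk_center)
x_c = quadrature_center(A, b, w, x0=disk_center + np.array([0.5, 0.1]))
phi_half = barrier(disk_center + np.array([0.5, 0.0]), A, b, w)
print(x_c, phi_half)
```

With 128 nodes the computed value of $\Phi$ at distance 0.5 from the disk center should agree closely with the closed form $2\pi \log((1 + \sqrt{1 - 0.25})/2)$, which is the kind of accuracy a uniform discretization of A reaches only with far more constraints.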

It should be noted that for semi-infinite problems, especially over a higher dimensional set
A, the computation of

$$ \min_{\alpha \in A} f(\alpha, x) = d(x), $$

which would be needed for an exact "estimation" of the "feasibility" (distance to the boundary)
and which, by the way, is also needed to accomplish a usual pivot step in the simplex
method, may be rather difficult.

Since in $\alpha$ this function may be neither concave nor convex, the computation of the
above value $d(x)$ is to be avoided. We can circumvent this problem in two ways: either
by adopting a close following of the central path, measuring the "distance" from it by a
specific quantity (which comes out of the well known analysis of the convergence domain of
Newton's method for computing the centers) for monitoring the stepsize selections, or by
using an adaptive discretization of the set A, which is automatically generated by the adaptive
quadrature method (by adding all the dyadic nodes in A = [a, b] which have been generated
for the various components of the functions to be integrated). We did not try to implement a
primal-dual procedure, since the dual variable here is a function.

In the linear case, when $f(\alpha, x) = b(\alpha) - a^T(\alpha) x$, we have the dual problem

$$ \max\Big\{ -\int_A b(\alpha)\, d\mu(\alpha) \ \Big|\ c_i = \int_A a_i(\alpha)\, d\mu(\alpha),\ i = 1, \dots, n,\ d\mu(\cdot) \ge 0 \Big\}, $$

known as a moment problem, and along the central path $x = x(r)$, $r > 0$, we generate feasible
solutions $\mu(\cdot)$ of the form

$$ d\mu(\alpha) = \frac{r\, d\alpha}{b(\alpha) - a^T(\alpha) x(r)}, $$

which approach $\delta$-type (atomic) measures as we approach a (nondegenerate) optimum. The
nodes generated by the adaptive quadrature method will concentrate more and more around
these points.

The problem of the optimal selection of the underlying measure $d\alpha$ over A will be discussed
in connection with affine invariance. We shall present test results also for the case when all
constraints $f(\alpha, x)$ are quadratic in x (arising from optimal control problems with "pointwise"
state and control bounds). In the MATLAB code we tried to use as much parallelism as possible
(in evaluating the functions, integrals and directions as vectors over the current index set). The
adaptive quadrature algorithm is hard to integrate into such a "parallel" environment, and
for moderate accuracy requirements it can be replaced by a fixed composite Gaussian quadrature.

Abstract

A probabilistic network is a network in which the vertices are assumed to be perfectly reliable and
the edges have independent operational probabilities. The k-terminal reliability of a probabilistic
network is the probability that all nodes in a selected set of k nodes are connected. We investigate
the effectiveness of bounding all- and two-terminal reliability using surface dualization techniques.
Dualization heuristics are discussed, and some computational results are given.


1  Introduction 

A  network  is  a  probabilistic  graph.  In  our  network  model  all  the  nodes  are  perfect  (i.e.  have 
operational  probability  1)  while  all  edges  have  a  fixed  operation  probability  in  the  range  [0, 1]. 
We  assume  that  the  operational  state  of  each  edge  is  independent  of  the  operational  states 
of  all  other  edges.  A  subset  of  two  or  more  nodes  is  selected  to  be  the  terminal  set  of  the 
network.  It  is  the  connectedness  of  these  terminals  that  reliability  measures. 

All-terminal reliability is the probability that the network is connected at any instant of
time. Two-terminal reliability is the probability that two terminal nodes s and t are
connected at any instant of time. Computing all- or two-terminal reliability is #P-complete even for planar graphs
[18]. Thus many polynomial time approximation techniques have been developed. These
techniques can be divided into two categories: Monte Carlo methods and bounding methods.
Monte Carlo methods result in an approximation with confidence intervals while bounding
methods produce an absolute lower or upper bound.

Bounding methods are of two main types: iterative methods and static methods. Static
methods produce a single bounding value, with no obvious way of improving the produced
bound. Iterative methods, however, allow for incremental improvement of produced bounds.
An important class of iterative methods are those that generate the most probable network
states. It has been shown that static methods based on subgraph extraction and network
transformation can greatly increase the efficiency of both Monte Carlo and most probable
states methods [7, 9]. Hence improvements on static methods are desirable, both to obtain
better bounds and to accelerate iterative techniques.

There are several existing all- and two-terminal static lower bounds. For all-terminal
bounds of networks with equal edge probabilities there are Kruskal-Katona [14] and
Ball-Provan [2]. Existing two-terminal static lower bounds include Kruskal-Katona [14],
Chari-Provan [4], and Series-Parallel [1]. Kruskal-Katona, Ball-Provan and Chari-Provan can only
be applied to networks in which all the edge probabilities are equal. Computational results
support the observation that existing upper bounds are tighter than existing lower bounds.
Hence better lower bounds are very desirable.

For planar graphs there is a one-to-one mapping between cutsets of a graph (primal) and
the cyclic subgraphs of its dual [19]. This translates to a linear relationship between the
two-terminal reliability of a graph and the two-terminal reliability of its s - t dual [22]. Our
strategy is to generalize these equalities that hold for planar graphs to inequalities which hold
for non-planar graphs and their surface duals.


2  Definitions 

A k-terminal network G = (V,E,p,T) consists of a set of nodes V, a set of undirected edges
E, a set of edge operational probabilities p, and a set of k terminals T. A two-terminal
network has two terminals s and t. For all-terminal networks we typically omit the terminal
set specification. We refer to G = (V,E) as the underlying graph of the network. The value
$p_{ij} \in p$ represents the probability that edge {i,j} is operational at any instant of time. It
is assumed that the edge operational probabilities are independent. Then $q_{ij} = 1 - p_{ij}$
represents the probability that edge {i,j} is non-operational at any instant of time, and
$p = (p_{ij} \mid \{i,j\} \in E)$. We assume that the network is simple and 2-edge connected, since
the reliability of a 1-edge connected multigraph can be linearly reduced to the calculation of the
reliability of at most n simple 2-connected subgraphs. In this paper the number of nodes in
a network is denoted by n and the number of edges by m.

In a directed k-terminal network G = (V,E,p,T), E contains directed edges (arcs). For
directed networks one terminal is denoted as the source s while all remaining terminals are
destination terminals. A directed all-terminal network is specified as G = (V,E,p,(s)) and a
directed two-terminal network is G = (V,E,p,(s,t)).

A pathset is a subgraph which connects all terminals (or in a directed network, connects the
source to all destination terminals). A minimal pathset is a pathset which does not properly
contain any other pathset. Hence an s - t minimal pathset is a path from s to t and an
all-terminal minimal pathset is a spanning tree. A cutset is a subset of edges that separates
one or more terminals from each other (or in the case of a directed network, separates the
source terminal from at least one of the destination terminals). A minimal cutset does not
properly contain another cutset.

The sets of all pathsets, cutsets and complements of pathsets of size i are denoted by $\nu_i$,
$\kappa_i$ and $\chi_i$ respectively, with $\nu = \cup \nu_i$, $\kappa = \cup \kappa_i$ and $\chi = \cup \chi_i$. The numbers of pathsets, cutsets,
and complements of pathsets containing i edges are denoted by $N_i$, $C_i$, and $F_i$ respectively.
It is easily verified that:

$$ N_i = \binom{m}{i} - C_{m-i}, \qquad N_i = F_{m-i}. $$

If all of the edge probabilities in the network are equal to p (with q = 1 - p), then Rel[G] can be expressed
in terms of the $N_i$'s, $C_i$'s or $F_i$'s as follows:

$$ Rel[G] = \sum_{i=0}^{m} N_i p^i q^{m-i} = \sum_{i=0}^{m} F_i p^{m-i} q^i = 1 - \sum_{i=0}^{m} C_i q^i p^{m-i}. $$

Sometimes, to emphasize the fact that we are interested in all- or two-terminal reliability, we
use $Rel_A[G]$ and $Rel_2[G]$ in place of Rel[G].
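
For very small networks the counts $N_i$, $C_i$ and $F_i$ can be obtained by exhaustive enumeration, which makes the identities above easy to check. The following Python sketch does this for a hypothetical five-edge network (a 4-cycle with one chord) in the all-terminal case; the helper names and the example graph are illustrative assumptions, and the enumeration is exponential in m, so it is not a practical algorithm.

```python
from itertools import combinations
from math import comb


def connects(n_nodes, edges, terminals):
    """True if the given edges connect all terminals (simple union-find)."""
    parent = list(range(n_nodes))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(t) for t in terminals}) == 1


def count_sets(n_nodes, edges, terminals):
    """Return N, C, F indexed by set size, following the definitions in the text."""
    m = len(edges)
    N, C, F = [0] * (m + 1), [0] * (m + 1), [0] * (m + 1)
    for i in range(m + 1):
        for kept in combinations(range(m), i):
            if connects(n_nodes, [edges[j] for j in kept], terminals):
                N[i] += 1        # the i kept edges form a pathset
                F[m - i] += 1    # the m - i other edges are the complement of a pathset
            else:
                C[m - i] += 1    # removing the m - i other edges separates terminals
    return N, C, F


def reliability(N, C, F, p):
    """The three equivalent expressions for Rel[G] with equal edge probability p."""
    m, q = len(N) - 1, 1.0 - p
    via_N = sum(N[i] * p**i * q**(m - i) for i in range(m + 1))
    via_F = sum(F[i] * p**(m - i) * q**i for i in range(m + 1))
    via_C = 1.0 - sum(C[i] * q**i * p**(m - i) for i in range(m + 1))
    return via_N, via_F, via_C


edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # 4-cycle plus the chord {0, 2}
N, C, F = count_sets(4, edges, terminals=range(4))
rel_N, rel_F, rel_C = reliability(N, C, F, p=0.9)
print(N, C, F, rel_N)
```

Here $N_3$ counts the spanning trees of the example graph, which are its minimal all-terminal pathsets.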

The  following  topological  definitions  are  needed  (see  Gross  and  Tucker  [10]). 

An imbedding of a network on a surface S is the drawing of the network on the surface such
that no edges of the network cross. A 2-cell imbedding is an imbedding in which all the regions
are open disks or cells. All imbeddings in this paper are assumed to be 2-cell imbeddings. An
orientable 2-cell imbedding is a 2-cell imbedding on an orientable surface. Spheres and planes
(which are just spheres with a point removed) are orientable surfaces of genus 0. For i ≥ 1, an
orientable surface of genus i can be represented by a string of i adjacent tori, or equivalently
as a sphere with i loops called handles added to it. A nonorientable 2-cell imbedding is a
2-cell imbedding on a nonorientable surface. Spheres, even though they are orientable, are
also taken to be nonorientable surfaces of crosscap 0. For i ≥ 1, a nonorientable surface of
crosscap i can be represented by a sphere with i holes, each hole closed by attaching to it a
Möbius band (created by taking a strip, twisting it once, and attaching its ends together) at its
boundary. Orientable imbeddings contain only straight edges while nonorientable imbeddings
contain both straight and twisted edges (see Figure 1).

Figure 1: Face Tracing Along Straight and Twisted Edges

Both  nonorientable  and  orientable  imbeddings  can  be  locally  oriented  in  that  a  clockwise 
or  counterclockwise  direction  can  be  associated  with  each  node  of  the  graph.  Straight  edges 
having  like  oriented  endpoints  and  twisted  edges  having  opposing  oriented  endpoints  are  type-0 
edges.  Similarly,  straight  edges  having  opposing  oriented  endpoints  and  twisted  edges  having 
like  oriented  endpoints  are  type-1  edges.  In  this  paper  we  will  assume  that  every  vertex  is 
locally  oriented  in  a  counterclockwise  direction  in  all  imbeddings.  Thus  straight  edges  will 
always  be  type-0  and  twisted  edges  will  always  be  type-1. 

A surface (topological) dual $G^D_I$ of a graph G with respect to an imbedding I is defined in
the same way as a planar dual. Each region of I is a vertex in $G^D_I$, and edge e of the graph is
added between nodes i and j in $G^D_I$ if e is common to the boundaries of regions i and j in G.
The surface dual of an all-terminal network is the surface dual of its underlying graph, with
the edge operational probabilities of the dual being p.

A generalization of an $s$-$t$ dual [22] can be defined for any 2-connected simple two-terminal
network $G = (V, E, p, \{s,t\})$ as follows. Find some circuit containing both $s$ and $t$,
$C = (s, v_2, v_3, \ldots, v_{j-1}, t, v_{j+1}, \ldots, v_k)$ (such a circuit must exist since $G$ is 2-connected). Add
edge $e_{st} = \{s,t\}$ to $G$. Adding $e_{st}$ to $G$ creates two circuits $C_1 = (s, v_2, v_3, \ldots, v_{j-1}, t)$ and
$C_2 = (t, v_{j+1}, \ldots, v_k, s)$. One can then find an imbedding for the underlying graph of network
$G \cup e_{st}$ such that these two cycles form region boundaries [23]. The edge probabilities in $G^D_{st}$
are $p$ and the terminals $s'$ and $t'$ of $G^D_{st}$ are the dual nodes corresponding to the regions whose
boundaries are $C_1$ and $C_2$ respectively. Thus $G^D_{st} = (V^D, E^D, p, \{s', t'\})$.

For a directed network $G$ we assign directions to the dual arcs as follows: if arc $e$ is directed
clockwise around region $R_1$ and thus counterclockwise around region $R_2$, then in the dual arc
$e^D$ is directed from node $R_1$ to node $R_2$. Furthermore, for all arcs $e^D = (i,j) \in E^D$, add
arc $e'^D = (j,i)$ to $E^D$ with $p_{e'} = 1$.

If $G$ is a two-terminal directed network, we direct $e_{st}$ from $s$ to $t$ and ensure that $e_{st}$ is
directed counterclockwise around $C_1$ and clockwise around $C_2$. We also delete all arcs directed
into $s'$ and all arcs out of $t'$. The $s$-$t$ dual of a directed network $G$ with respect to some
imbedding is denoted

One algebraic method of representing a 2-cell imbedding is a rotation system. Given a
2-cell imbedding $I$, the rotation system representing $I$ consists of a rotational vector $r_i$ for
each vertex $i$. Each $r_i$ contains entries for edges incident to vertex $i$ in the order they are
encountered  when  making  a  circulation  around  i  consistent  with  its  local  orientation  (which  in 
this  paper  is  always  counterclockwise).  If  an  edge  is  of  type-1  we  add  a  superscript  1  to  it  when 
including  it  in  the  rotation  system.  There  is  a  1-1  mapping  between  rotation  systems  and 
locally  oriented  graph  imbeddings  (up  to  equivalence  of  imbeddings).  It  is  a  very  simple  matter 
to  determine  the  faces  of  an  imbedding  and  thus  its  associated  dual  from  its  rotation  system 
representation.  Finally  a  rotation  system  containing  only  type-0  edges  is  called  pure.  There 
is  a  1-1  correspondence  (up  to  equivalency  of  imbeddings)  between  orientable  imbeddings  and 
pure  rotation  systems.  For  more  details  see  [10]. 
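
The face tracing step mentioned above can be illustrated with a minimal Python sketch. It assumes a simple graph and a pure rotation system (type-0 edges only), stored as a dictionary from each vertex to its neighbours in counterclockwise order. The function names `trace_faces`, `genus` and `dual_edges` are illustrative only and do not come from the paper.

```python
def trace_faces(rotation):
    """Trace the faces of an orientable imbedding given by a pure rotation system.

    rotation: dict mapping each vertex to the list of its neighbours in
    counterclockwise order (simple graph assumed, so a neighbour names an edge).
    Returns a list of faces, each a list of directed edges (darts).
    """
    succ = {}
    for v, nbrs in rotation.items():
        for k, u in enumerate(nbrs):
            # Arriving at v along (u, v), leave along the edge that follows u
            # in the rotation at v.
            succ[(u, v)] = (v, nbrs[(k + 1) % len(nbrs)])
    faces, seen = [], set()
    for dart in succ:
        if dart in seen:
            continue
        face, d = [], dart
        while d not in seen:
            seen.add(d)
            face.append(d)
            d = succ[d]
        faces.append(face)
    return faces


def genus(rotation):
    """Genus from Euler's formula V - E + F = 2 - 2g (connected graph assumed)."""
    v = len(rotation)
    e = sum(len(nbrs) for nbrs in rotation.values()) // 2
    f = len(trace_faces(rotation))
    return (2 - (v - e + f)) // 2


def dual_edges(rotation):
    """One dual edge per primal edge, joining the faces on its two sides."""
    faces = trace_faces(rotation)
    face_of = {d: k for k, face in enumerate(faces) for d in face}
    return sorted((face_of[(u, v)], face_of[(v, u)]) for (u, v) in face_of if u < v)


# K4 with vertex 0 in the middle of the triangle 1, 2, 3 (a planar rotation system).
K4_PLANAR = {0: [1, 2, 3], 1: [2, 0, 3], 2: [3, 0, 1], 3: [1, 0, 2]}
# Reversing the rotation at vertex 0 gives an imbedding of K4 on the torus.
K4_TORUS = {0: [1, 3, 2], 1: [2, 0, 3], 2: [3, 0, 1], 3: [1, 0, 2]}
```

In this sketch the toroidal rotation system yields a dual with loops, which is the effect that makes high genus imbeddings less attractive for the bounds of Section 3.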


3  Bounds  via  Duality 

This first proposition is due to Richter and Shank [19]:

Proposition 1 Let $G = (V, E)$ be a simple undirected graph and let $I$ denote any imbedding
of $G$ on some surface $S$. Let $G^D_I = (R, E)$ be the dual of $G$ with respect to $I$. Let $C$ be a
minimal cut of $G$ and $C^D$ its corresponding subgraph in $G^D_I$.

Then the degree of every vertex in $C^D$ is even, i.e. $C$ contains an even number of boundary
edges from every region in $G$.

Define a cycle as a loop or closed path and a circuit to be a connected set of one or more
edge disjoint cycles.

Corollary 3.1 $C^D$ can be expressed as a set of circuits.

This follows from the fact that all vertices in $C^D$ have even degree. Thus the number of cyclic
subgraphs of size $i$ in $G^D_I$ is an upper bound for the number of cutsets of size $i$ in $G$.

Let $Sp_i(G)$ denote the number of spanning forests on $i$ edges of $G$.

Corollary 3.2 For an all-terminal undirected network $G$, $F_i \geq Sp_i(G^D_I)$.

This follows from the fact that all network cutsets of size $i$ map to cyclic subgraphs of size $i$
in the dual, so a set of $i$ edges that forms a forest in the dual cannot be a cutset of $G$.

The corollary indicates that for all-terminal networks with equal edge operational probabilities
it may be possible to improve the lower bound on some of the $F_i$'s. An algorithm of
Liu and Chow [15] can be used to compute the number of $k$ component spanning forests in
time polynomial in $m$ but exponential in $k$. Therefore if there are $n_D$ vertices in the topological
dual, we can efficiently compute lower bounds for $F_{n_D - 1}, \ldots, F_{n_D - k}$ where $k$ is some small
integer constant. If any of these are greater than the lower bounds produced by conventional
Kruskal-Katona methods, they can be used instead to produce a better lower bound. Similarly,
if any of these $F_i$ lower bounds are better than the lower bounds produced by the Ball-Provan
bounds, we can determine a better Ball-Provan lower bound using the improved $F_i$ lower
bounds (see Section 4). For more information about the Kruskal-Katona and Ball-Provan
techniques see [5].
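
To make the quantity $Sp_i$ concrete, the following Python sketch counts spanning forests on $i$ edges by brute force. This is only an illustration for very small duals and is not the algorithm of Liu and Chow [15]; the function name and the edge-list representation are assumptions. Parallel edges are counted as distinct and loops never belong to a forest, which is why duals with many loops and multiple edges give weaker bounds.

```python
from itertools import combinations


def count_spanning_forests(num_vertices, edges, i):
    """Count the acyclic edge subsets of size i in a (multi)graph.

    edges: list of (u, v) pairs with vertices numbered 0..num_vertices-1.
    Parallel edges are distinct list entries; a loop (u, u) is always cyclic.
    """
    count = 0
    for subset in combinations(edges, i):
        parent = list(range(num_vertices))

        def find(x):
            # Union-find root lookup with path halving.
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:  # the edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            count += 1
    return count


# Triangle: every set of at most two edges is a forest, all three form a cycle.
triangle = [(0, 1), (1, 2), (0, 2)]
forest_counts = [count_spanning_forests(3, triangle, i) for i in range(4)]
```

The running time is exponential in the number of edges, so the sketch is useful only for checking small cases by hand.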

We  now  turn  our  attention  to  two-terminal  networks. 

Proposition 2 An $s$-$t$ cut $C$ in an $s$-$t$ network $G = (V, E, p, \{s,t\}) \cup e_{st}$ is a cyclic
subgraph $C^D$ in any $s$-$t$ dual $G^D_{st} = (V^D, E^D, p, \{s', t'\})$ of $G$. Furthermore $s'$ and $t'$ both lie
on some common circuit of $C^D$.

Corollary 3.3 $C^D - e^D_{st}$ is an $s'$-$t'$ pathset in $G^D_{st} - e^D_{st}$.

By the above proposition any $s$-$t$ cut $C$ in $G \cup e_{st}$ forms a cyclic subgraph $C^D$ in $G^D_{st}$
where $s'$ and $t'$ are contained on some circuit. $C - e_{st}$ is a cut in $G$ and $C^D - e^D_{st}$ is a subgraph
of $G^D_{st} - e^D_{st}$ in which $s'$ and $t'$ are connected.

From this point on, $G^D_{st}$ refers to the topological $s$-$t$ dual with edge $e^D_{st}$ removed.

Corollary 3.4 Let $G = (V, E, p, \{s,t\})$ be a two-terminal network and let

$$G^D_{st} = (V^D, E^D, p, \{s', t'\})$$

be any $s$-$t$ dual of $G$. Then $Rel_2[G] \geq 1 - Rel_2[G^D_{st}]$.

Let $\mathcal{C} = \{C_1, \ldots, C_k\}$ be the set of $s$-$t$ cutsets of $G$ and $\mathcal{P} = \{P_1, P_2, \ldots, P_r\}$ be the
set of $s'$-$t'$ pathsets of $G^D_{st}$. By Corollary 3.3, $\mathcal{C} \subseteq \mathcal{P}$. Thus

$$\sum_{i=1}^{k} \prod_{e \in C_i} q_e \leq \sum_{i=1}^{r} \prod_{e \in P_i} q_e,$$

and therefore

$$1 - \sum_{i=1}^{k} \prod_{e \in C_i} q_e \geq 1 - \sum_{i=1}^{r} \prod_{e \in P_i} q_e,$$

since the sums are between 0 and 1.

Thus $Rel_2[G] \geq 1 - Rel_2[G^D_{st}]$. This implies that a lower bound for $Rel_2[G]$ can be obtained
from an upper bound for $Rel_2[G^D_{st}]$.

Proposition 3 Let $G = (V, E, p, \{s,t\})$ be a directed 2-terminal network and
any  directed  s  —  t  topological  dual  of  G.  Then  the  Reli[G]  —  Relj[G^]. 


4  All-terminal  Implementation 

A major stumbling block in implementing this bounding technique is to find an imbedding
that produces a large number of spanning forests in the dual. Here we examine the suitability
of this method using orientable imbeddings. We first attempt to sample a small percentage of
pure rotation systems randomly and select the one producing the dual with the most spanning
trees. The higher the genus of the imbedding, the higher the likelihood of a reduction in unique
spanning forests due to loops and multiple edges. Unfortunately, the number of high genus
imbeddings in our test cases far exceeds the number of low genus imbeddings. Thus, random
sampling does not appear to be very productive. Consequently, we develop a heuristic which
attempts to construct low genus imbeddings (imbeddings with lots of regions) by ensuring
that a maximal set of cycles become faces in the imbedding. The heuristic is as follows:

Let $G = (V, E, p)$. Given a subgraph $S$, let $edges(S)$ denote the set of edges in $S$.

1. Find a minimal cycle $C_1$ in $G$. Set $E_u = edges(C_1)$. Set $E_a = E - E_u$. Set $\Theta = \{C_1\}$.

2. For each $e = (v_i, v_j) \in E_u$ find, if possible, a shortest path $P$ from $v_i$ to $v_j$ in $E_a$. If such
a path is found, set $E_a = E_a - edges(P)$, $E_u = E_u + edges(P)$, and $\Theta = \Theta \cup \{P + e\}$.
In any case, set $E_u = E_u - \{e\}$. Repeat this step until either $E_u = \emptyset$ or $E_a = \emptyset$.

3. If $E_u = \emptyset$ but $E_a \neq \emptyset$ then find a cycle $C$ in $E_a$. If such a cycle is found then set
$E_a = E_a - edges(C)$, $E_u = edges(C)$, $\Theta = \Theta \cup \{C\}$ and repeat step 2.

4. Find a rotation system for the subgraph $S = (V, E_u)$ that ensures the circuits in $\Theta$ form
face boundaries. This is always possible because $\bigcup_{1 \leq i \leq |\Theta|} C_i$ defines a planar imbedding.

5. If $E_a \neq \emptyset$ then add the remaining edges in $E_a$ at the end of their respective incident
vertices' rotation vectors.

The cycles found in step 2 become faces in the imbedding defined by the rotation system
constructed above. Thus, if $k$ cycles are found by the above algorithm, we are assured of at
least $k + 1$ faces in the produced imbedding. In the final paper, we present computational
results using this heuristic.
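
Steps 1 to 3 of the heuristic (the collection of the cycle set $\Theta$) can be sketched in Python as follows. The sketch rests on several assumptions that are not stated in the text: the graph is simple, "minimal cycle" is read as a shortest cycle, breadth-first search is used for shortest paths, and the cycle in step 3 is again taken to be a shortest one. Steps 4 and 5 (building the rotation system) are not covered, and all function names are illustrative.

```python
from collections import deque


def shortest_path(edges, src, dst):
    """Breadth-first search over a set of sorted vertex pairs; returns path edges or None."""
    adj = {}
    for u, v in sorted(edges):
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    prev = {src: None}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        if x == dst:
            path = []
            while prev[x] is not None:
                path.append(tuple(sorted((x, prev[x]))))
                x = prev[x]
            return path
        for y in adj.get(x, []):
            if y not in prev:
                prev[y] = x
                queue.append(y)
    return None


def shortest_cycle(edges):
    """Shortest cycle as a list of edges, or None if the edge set is acyclic."""
    best = None
    for e in sorted(edges):
        path = shortest_path(edges - {e}, e[0], e[1])
        if path is not None and (best is None or len(path) + 1 < len(best)):
            best = path + [e]
    return best


def collect_face_cycles(edge_list):
    """Return (theta, e_a): the collected cycles and the edges left unassigned."""
    edges = {tuple(sorted(e)) for e in edge_list}
    first = shortest_cycle(edges)                  # step 1
    if first is None:
        return [], edges
    theta = [first]
    e_u = deque(first)
    e_a = edges - set(first)
    while e_a:
        while e_u and e_a:                         # step 2
            e = e_u.popleft()
            path = shortest_path(e_a, e[0], e[1])
            if path is not None:
                e_a -= set(path)
                e_u.extend(path)
                theta.append(path + [e])
        if not e_a:
            break
        cycle = shortest_cycle(e_a)                # step 3
        if cycle is None:
            break
        e_a -= set(cycle)
        e_u = deque(cycle)
        theta.append(cycle)
    return theta, e_a


K4_EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
theta, leftover = collect_face_cycles(K4_EDGES)
```

In this sketch every edge appears in at most two collected cycles, which is necessary for the cycles to serve as face boundaries.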

5 Two-terminal Implementation

Dualization results for two-terminal reliability lower bounds appear to be much better than
those for all-terminal reliability. For two-terminal reliability, we have a direct correlation
between the reliability of the dual and the reliability of the network. Here we examine a
heuristic which generates an orientable imbedding.

Once again the major difficulty in implementing this bounding technique is to select a
suitable $s$-$t$ imbedding. As in the all-terminal case, we wish to minimize the genus of our
produced imbedding. However, minimizing the genus is not sufficient to ensure that a good
imbedding is produced. The longer the minimum $s'$-$t'$ path in the dual, the better the
produced bounds. We therefore employ the following modified version of the all-terminal
imbedding heuristic.

Let $G = (V, E, p, \{s,t\})$. Given a subgraph $S$, let $edges(S)$ denote the set of edges in $S$.

1. Find the shortest $s$-$t$ circuit $C$ in $G$ (for example, via a mincost flow algorithm), and
add edge $e_{st} = (s,t)$ to $G$ to get the two circuits $C_1$ and $C_2$. $C_1$ and $C_2$ become $s'$ and $t'$
in $G^D_{st}$. Set $E_u = edges(C_1) \cup edges(C_2)$. Set $E_a = E - E_u$. Set $\Theta = \{C_1, C_2\}$.

2. For each $e = (v_i, v_j) \in E_u$ find, if possible, a shortest path $P$ from $v_i$ to $v_j$ in $E_a$. If such
a path is found, set $E_a = E_a - edges(P)$, $E_u = E_u + edges(P)$, and $\Theta = \Theta \cup \{P + e\}$.
In any case, set $E_u = E_u - \{e\}$. Repeat this step until either $E_u = \emptyset$ or $E_a = \emptyset$.

3. If $E_u = \emptyset$ but $E_a \neq \emptyset$ then find a cycle $C$ in $E_a$. If such a cycle is found then set
$E_a = E_a - edges(C)$, $E_u = edges(C)$, $\Theta = \Theta \cup \{C\}$ and repeat step 2.

4. Find a rotation system for the subgraph $S = (V, E_u)$ that ensures the circuits in $\Theta$ form
face boundaries. This is always possible because $\bigcup_{1 \leq i \leq |\Theta|} C_i$ defines a planar imbedding.

5. If $E_a \neq \emptyset$ then add the remaining edges in $E_a$ at the end of their respective incident
vertices' rotation vectors.

This  heuristic  for  imbedding  gives  an  efficient  implementation  of  the  bound  in  proposition 
3.  In  the  final  paper,  we  explore  the  accuracy  of  the  resulting  bound  compared  to  currently 
available  methods. 

6  Conclusions 

Surface  dualization  techniques  deliver  bounds  that  are  competitive  with  the  current  best 
methods.  Furthermore,  they  can  be  utilized  for  state  space  reduction  in  Monte  Carlo  and 
Most Probable State methods. In the final paper, these applications are discussed further.

In this paper we discuss a location-distribution problem and
its implementation for a beer brewing company. The
company has three breweries, two malt factories, three
product types and about 270 customer zones. We consider 15
candidate locations for the new breweries to be established in
the near future. We study the current distribution plan and
evaluate the alternative locations for the new breweries.

We  develop  two  models  to  solve  the  problem.  In  both 
models  we  aggregate  different  types  of  beer  into  a  single  type 
as  the  transportation  cost  differences  between  different  types 
are  small.  This  aggregation  reduces  the  problem  size  and  it 
makes it easier to manage on PCs. We also excluded the
fixed cost of construction from the models as this cost did not
vary much among alternative locations.

The first model is a mixed integer programming model that
considers  the  transportation  costs  of  malt  to  the  breweries 
and  beer  to  the  demand  points.  The  model  solves  for  the 
distribution  plan  of  malt  and  beer,  and  the  locations  of  new 
breweries. In the second model, in addition to the above, we
incorporate inventory carrying. The seasonality of demand
is an important issue in beer consumption and has
serious implications for the amount of inventory carried.
The high inflation rate in the economy also
magnifies the importance of carrying inventories. In this
case the model becomes a multi-period model where months
represent periods.
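
A formulation consistent with the description of the first model might look as follows. This is only a sketch under stated assumptions: the abstract does not give the model, so every symbol below is hypothetical. Here $M$ is the set of malt factories, $B$ the existing breweries, $N$ the candidate locations, $J$ the customer zones, $y_{mi}$ the malt shipped from factory $m$ to brewery $i$, $x_{ij}$ the (aggregated) beer shipped from brewery $i$ to zone $j$, $z_i$ a binary variable that opens candidate $i$, $a_{mi}$ and $c_{ij}$ unit transportation costs, $d_j$ demand, $K_i$ brewery capacity, $S_m$ malt supply, $\rho$ the malt required per unit of beer, and $n$ the number of new breweries to open.

```latex
\begin{align*}
\min \quad & \sum_{m \in M} \sum_{i \in B \cup N} a_{mi}\, y_{mi}
           + \sum_{i \in B \cup N} \sum_{j \in J} c_{ij}\, x_{ij} \\
\text{s.t.} \quad
 & \sum_{i \in B \cup N} x_{ij} = d_j            && \forall j \in J \\
 & \sum_{j \in J} x_{ij} \le K_i                 && \forall i \in B \\
 & \sum_{j \in J} x_{ij} \le K_i\, z_i           && \forall i \in N \\
 & \sum_{m \in M} y_{mi} = \rho \sum_{j \in J} x_{ij} && \forall i \in B \cup N \\
 & \sum_{i \in B \cup N} y_{mi} \le S_m          && \forall m \in M \\
 & \sum_{i \in N} z_i = n \\
 & x_{ij} \ge 0,\; y_{mi} \ge 0,\; z_i \in \{0,1\}
\end{align*}
```

No fixed cost term appears in the objective, in line with the statement that construction costs were excluded. A multi-period version in the spirit of the second model would add a period index, inventory balance constraints, and a limit on the budget tied up in inventory.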

The models are applied in four modules: the
data manipulation module in Lotus 1-2-3, the model
generation module in Fortran, the solution in LINDO, and
the reporting module in Fortran. The program is designed
so  that  different  applications  of  models,  such  as  clustering  of 
the  customer  zones,  varying  the  number  of  product  types, 
and  including  the  fixed  costs  of  new  establishments  are 
possible. 

A  series  of  runs  for  both  models  is  executed  on  an  IBM  486 
compatible  computer.  Each  run  takes  several  minutes.  Since 
the  estimation  of  the  true  inventory  holding  cost  is  not 
straightforward,  we  represent  the  trade-off  between 
transportation  and  inventory  expenses  by  restricting  the 
amount  of  budget  tied  up  in  inventory.  We  solve  the 
problem a number of times by setting this budget to different
levels and plot the transportation cost against the inventory budget,
which gives a comparable basis for the two cost items.

We  present  the  results  of  our  study  and  discuss  the 
implementation  under  several  scenarios.  The  results 
obtained by the model have been found useful by the
management, who decided to locate their plant at the
location suggested by our study.

Keywords:  Location,  distribution 

ABSTRACT

Forward discrete dynamic programming was used to
optimise the pollution abatement effort along a river basin,
through the adequate location and operation of treatment
plants, at minimum cost. Upper and lower bounds were set in
terms of efficiency of pollution removal, which is the
'semi-independent' variable, in order to bound the solution
in a small neighbourhood. The efficiencies had to be
adequately translated into the independent variable, the
river water quality standard (measured in mg/l of BOD,
Biochemical Oxygen Demand), since it determines whether the
value obtained is a feasible solution or not.

Keywords: dynamic programming, global optimisation, interval approach.

1. INTRODUCTION

This paper presents part of a major research project undertaken
in order to develop a computerized framework to assist in
the management of a river basin industrial wastewater
system. It is applicable to rivers following a Dobbins type
model, and the quality parameters under surveillance are the
Biochemical Oxygen Demand (BOD) and the Dissolved Oxygen
(DO). The WODA commercial package was used and adapted for the
attainment of 4 BOD and 1 DO standards, using dynamic
programming, the subject of this paper. Geometric
programming was used to select the minimum cost preliminary
treatment plant designs. Other pollution abatement measures
considered are flow augmentation and artificial aeration. A
'compromise' minimum cost solution between a completely
centralised allocation of pollution abatement effort and a
single company most economical solution is determined. The
number of concentrated effluent discharge points (n)
determines the number of stages or reaches being analysed
(n-1), one at a time.

The objective function to be minimised is the cost of
the pollution abatement measures, which is the sum of up to
three terms, depending on the number of different
types of measures being considered (treatment plants,
artificial aeration, and low flow augmentation). Four BOD
standards, 8.0, 6.5, 5.0 and 3.5 mg/l, and one DO standard, 5.0
mg/l, were tested. These include, as middle and lower
values, the EEC standards. The optimisation problem was
solved using a forward dynamic programming procedure which
evaluated the minimum cost abatement efficiency.

2.  OPTIMISATION  PROCEDURE 

2.1.  Preliminary  Approaches 

The simulation routine is first run for the initial
conditions of the data to check whether any standard is
violated. When a violation occurs, the concentration of BOD
in the concentrated discharge immediately upstream is
reduced. The first reduction is 35%, for technological
reasons, and after that the reduction is done in steps of
5% until no violation occurs. If the violation is mainly of
the DO standard, then artificial aeration can be tested. If
the improvement is not enough, or if the violation is not
only on the DO level, then flow augmentation can be studied
to couple with treatment.
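
The stepwise reduction rule can be written as a short Python sketch. The simulation routine itself is not described here, so it is represented by a hypothetical callback `violates(efficiency)` that reports whether the standard is still violated at a given removal efficiency (in percent). The upper limit `max_efficiency` is likewise an assumption of the sketch.

```python
def minimum_removal_efficiency(violates, first_step=35, step=5, max_efficiency=95):
    """Smallest removal efficiency (%) on the 35, 40, 45, ... grid with no violation.

    violates: hypothetical callback standing in for the simulation routine.
    Returns 0 if no treatment is needed and None if the grid is exhausted.
    """
    if not violates(0):
        return 0
    efficiency = first_step
    while efficiency <= max_efficiency:
        if not violates(efficiency):
            return efficiency
        efficiency += step
    return None


# Toy stand-in for the simulation: a load of 20 mg/l against a standard of 8 mg/l.
# Integer arithmetic avoids rounding trouble exactly at the limit.
def toy_violates(efficiency):
    return 20 * (100 - efficiency) > 8 * 100


needed = minimum_removal_efficiency(toy_violates)
```

When the function returns None, treatment alone is insufficient on this grid, which is the situation in which the text turns to artificial aeration or flow augmentation.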

The cost of complying with each standard in each reach
is calculated and a variable, ROOT, depending on the reach
and on the standard is stored, representing the pathway
taken. Indeed such a variable characterises a node in the
dynamic programming algorithm being implemented. It was
called NODE. The position and the value of the node's digits
give information about the reach number and the standard
being attained, respectively. The node structure is:

Node digit position:   1st   2nd   3rd   4th   ...
Reach number:          1     2     3     4     ...

Node digit value:      1     2     3     4
Standard code (mg/l):  8.0   6.5   5.0   3.5
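
The node encoding can be illustrated with a small Python sketch. The function name `decode_node` and the handling of digits outside the range 1 to 4 (returned as None) are assumptions of the sketch.

```python
# Digit value -> BOD standard in mg/l, as in the node structure above.
STANDARDS = {1: 8.0, 2: 6.5, 3: 5.0, 4: 3.5}


def decode_node(node):
    """Split a node number into (reach number, standard attained) pairs.

    The k-th digit refers to reach k. Digits without an entry in STANDARDS
    are reported as None.
    """
    return [(reach, STANDARDS.get(int(digit)))
            for reach, digit in enumerate(str(node), start=1)]


# Node 44333: standard 3.5 in reaches 1 and 2, standard 5.0 in reaches 3 to 5.
decoded = decode_node(44333)
```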

Two  procedures  were  developed  for  the  analysis  of  the 
reaches. One of them is based on the assumption that any
standard  can  be  achieved  in  any  reach,  independently  of  the 
standard  achieved  in  the  previous  reach  or  reaches  (case  A) . 
The  other  procedure  (case  B)  presupposes  that  it  is 
reasonable  to  assume  that  any  reach  downstream  will  not  be 
required  to  comply  with  a  tighter  standard  than  reaches 
upstream.  In  other  words,  this  case  evaluates  the  minimum 
cost  of  achieving  at  least  a  certain  quality  level  in  all 
reaches upstream. Results obtained in both cases with test
data follow (1).

Case A. The following final results were obtained:

RESULTS AFTER REACH 5 (last reach)

BOD Stand.   Efficiency of BOD removal     Node     Cost
(mg/l)       (reach i, i=1,reach) (%)               (US$ 1000)

8.00         0; 75; 0; 0; 0                23211    416.27
6.50         0; 75; 0; 0; 0                23212    416.27
5.00         0; 75; 0; 75; 0               23233    832.53
3.50         impossible

Analysing this final result we can say that only two
standards, BOD < 8.0 and 6.5 mg/l, are being achieved
everywhere, which is far from the objective of the
simulation. This network corresponds to the most common
dynamic programming structure.

(1) All the costs are reported at 1969 values, and the total
number of reaches tested was 5.

Another constraint should be tested and carried out at
every stage of the dynamic programming procedure: the cost
of attaining at least every standard in all reaches
upstream, which was the single objective of case B.

Case B. The following final results were obtained:

RESULTS AFTER REACH 5 (last reach)

BOD Stand.   Efficiency of BOD removal     Node     Cost
(mg/l)       (reach i, i=1,reach) (%)               (US$ 1000)

8.00         35; 0; 0; 35; 0               31111    536.43
6.50         0; 40; 40; 35; 0              22222    824.76
5.00         45; 70; 35; 45; 0             44333    1230.27
3.50         impossible


By considering, for instance, the node 44333 we can see
that the minimum cost of complying with at least standard
5.00 (number 3 in the node) was obtained when in reaches 1
and 2 a tighter standard (3.5, number 4 in the node) was
obeyed. So, in spite of treating more than necessary, a
smaller overall cost was obtained. In other words, a
more than necessary pollution reduction in a certain reach
can result in an overall reduction in cost by avoiding the
need for action(s) downstream. This network corresponds to
the simplified dynamic programming structure.

[Diagram: simplified dynamic programming network. Horizontal axis: reaches 1 to 4; vertical axis: standards 8.00, 6.50, 5.00, 3.50.]

Conclusion: the constrained solution, case B, has the
advantage of allowing us to specify, at any stage, the
minimum level of BOD attained at any point upstream. This
is, in fact, the last standard to be checked, assuming that
the optimisation procedure starts with the least tight
standard. Case A was discarded, provided adequate
constraints were added to the formulation of Case B.

We notice that standard 3.5 has not been achieved in
every reach, the solution obtained presenting it as
impossible to attain. This fact would oblige us to study the
use of flow augmentation to couple with treatment, from the
reach onwards where treatment alone became insufficient. The
overall cost would certainly rise very substantially.
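
The difference between cases A and B can be illustrated with a generic forward dynamic programming sketch in Python. It is not the authors' program: the stage cost is a hypothetical callback `stage_cost(reach, prev_code, code)` standing in for the simulation and cost routines, and the case B restriction is modelled by requiring the standard code not to increase downstream.

```python
def forward_dp(n_reaches, codes, stage_cost, monotone=True):
    """Forward DP over reaches; states are standard codes (higher code = tighter).

    stage_cost(reach, prev_code, code): hypothetical cost of attaining `code`
    in `reach` when the reach upstream attained `prev_code` (None for reach 1).
    monotone=True mimics case B: a downstream reach is never held to a tighter
    standard than the reach upstream of it.
    Returns {final_code: (total_cost, node_digits)}.
    """
    best = {c: (stage_cost(1, None, c), (c,)) for c in codes}
    for reach in range(2, n_reaches + 1):
        nxt = {}
        for c in codes:
            candidates = [
                (cost + stage_cost(reach, prev, c), path + (c,))
                for prev, (cost, path) in best.items()
                if not monotone or c <= prev
            ]
            if candidates:
                nxt[c] = min(candidates)
        best = nxt
    return best


# Toy costs: treating to code c costs 8*c in reach 1; downstream it costs 10*c
# unless the reach upstream was treated to a tighter code, in which case it is free.
def toy_cost(reach, prev, code):
    if prev is None:
        return 8 * code
    return 0 if prev > code else 10 * code


case_b = forward_dp(2, [1, 2], toy_cost)
case_a = forward_dp(2, [1, 2], toy_cost, monotone=False)
```

With these toy costs the cheapest way to end at code 1 is the node (2, 1), so treating more than necessary upstream lowers the total cost, the effect noted for node 44333.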

2.2.  Charging  Schemes 

Several  methods,  without  using  any  optimisation 
algorithm  were  tried,  in  order  to  compare  different  charging 
schemes  for  the  usual  four  BOD  standards  being  considered. 
These  were  a  minimum  treatment  method,  minimum  treatment 
method  but  enforcing  primary  treatment  in  every  reach,  and 
a  constant  removal  method. 

The results obtained are summarised below, in Table 1.
As we can see, the method closest to the minimum cost is the
Minimum Treatment Method.

2.3.  Refined  Approach 

In fact, the Equal Treatment Method can be considered an
upper limit for the minimum cost solution, as far as cost
itself is concerned. The same does not apply to removal
efficiencies, which have been the 'semi-independent'
variables in the optimisation procedure above. The really
independent variables are the standards. When comparing the
performance of the different charging schemes it was noticed
that standard 3.5 can be achieved by treatment only,
requiring an average removal rate of 75%. This fact proved
that the above optimisation was wrong and needed to be
altered. Consequently, the optimisation procedure was
re-studied. The problems involved with discrete dynamic
programming of this type are very closely related to the
combinatorial aspect of the resulting steps. In order to
reduce them it is advisable to use upper and lower limits,
if known, to bound the solution in a small neighbourhood.

[Table 1: Comparison of charging schemes by standard. Recoverable entries: Minimum Cost, standards 1 to 4: 536.42, 836.13, 1051.56, 1532.96; one further cost entry, 1799.33, belongs to another method.]


As we have seen, the Equal Treatment Method would be a good
upper limit if the cost was the independent variable. Using
the simulation routine we can change the efficiency of
removal values by a predefined 5% step which, in turn,
defines the discrete grid used.

We  should  be  looking  for  an  upper  bound  on  the 
efficiency,  since  we  know  that  treating  more  than  necessary 
upstream  may  result  in  lower  overall  costs.  If  we  knew  that 
the  optimum  would  be,  say,  limited  by  85%  efficiency  of 
removal,  then,  for  reach  1  we  could  start  analysing  the 
minimum  efficiency  needed  to  comply  with  the  standard  in 
that  reach.  All  other  upper  efficiencies  up  to  85%  would  be 
possible,  at  a  higher  cost.  When  proceeding  to  reach  2  the 
same  analysis  would  be  done,  for  each  of  the  possible 
outcomes  from  reach  1.  The  minimum  cost  solution  would  be 
selected,  after  ensuring  that  it  could  lead  to  a  feasible 
solution, by which we mean a node which can lead to the
achievement of the standard until the mouth of the river.
Lacking real data for the desired upper limit, we used the
maximum possible efficiency of 99%, and two lower
efficiencies, in order to split the possible space and
reduce iterations for some of the standards. One of the
efficiencies was 95%, since it is the limit for secondary
treatment, and the other was 75%, for no other reason than
being the average value obtained by the Equal Treatment
Method.

An important correspondence still to be established was
the translation of the efficiencies into standards, since
these determine whether the value obtained is a solution or
not. The main program was altered to allow for three
introductory runs of the simulation routine for constant
removal rates of 75%, 95% and 99%, for each of the concentrated
discharge BOD concentrations. The maximum BOD value obtained
in each run is automatically selected and stored as a
'slack' or 'dummy' standard. Thus, the attainment of
standard 8.0, for instance, will now have 6 'ceilings' to
test in every reach. The complete diagram is shown below.


[Diagram: dynamic programming network including the dummy standards. Horizontal axis: reach number.]


The optimisation proceeds by checking every solution
for feasibility, by replacing downstream BOD concentrated
effluent values with the minimum possible (99% reduction) and
running the simulation routine. Only if the intermediate
solution being determined passes this test will it be
stored. A new routine was written for this purpose. If no
solution is achieved for a certain standard, then
treatment is not enough and has to be coupled with flow
augmentation. The final results obtained are shown below:

The cost and efficiency values are not completely
equivalent to those found previously, also due to a slight
difference in the value of the quality parameters.
Corrective action was taken, mainly by rewriting the proper
software routines.


RESULTS AFTER REACH 5

BOD Stand.   Efficiency of BOD removal     Node     Cost
(mg/l)       (reach i, i=1,reach) (%)               (US$ 1000)

8.00         35; 0; 0; 35; 0               31111    536.42
6.50         45; 40; 0; 35; 0              43222    836.13
5.00         75; 55; 0; 55; 0              54333    1,051.50
3.50         75; 75; 55; 70; 0             55444    1,532.96
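
The feasibility test used in the refined approach (complete a partial solution with the maximum possible removal downstream and rerun the simulation) can be sketched in Python. The simulation routine is not given in the text, so `simulate` is a hypothetical callback that returns the BOD level in each reach for a list of removal efficiencies. The toy model below is purely illustrative and is not a Dobbins type model.

```python
def is_feasible(prefix, n_reaches, standard, simulate, max_efficiency=99):
    """Can the partial solution `prefix` still lead to compliance everywhere?

    prefix: removal efficiencies (%) already fixed for the upstream reaches.
    The remaining reaches are set to the maximum possible reduction before
    the hypothetical `simulate` callback is run.
    """
    trial = list(prefix) + [max_efficiency] * (n_reaches - len(prefix))
    return max(simulate(trial)) <= standard


# Toy stand-in for the simulation routine: each reach receives its own
# discharge after treatment plus half of the BOD of the reach upstream.
def toy_simulate(efficiencies, loads=(20.0, 40.0, 10.0)):
    levels, carried = [], 0.0
    for load, eff in zip(loads, efficiencies):
        bod = carried + load * (100 - eff) / 100
        levels.append(bod)
        carried = 0.5 * bod
    return levels


keep_35 = is_feasible([35], 3, 8.0, toy_simulate)  # reach 1 alone already exceeds 8.0
keep_75 = is_feasible([75], 3, 8.0, toy_simulate)
```

An intermediate solution would be stored only when this check succeeds; if it fails for every candidate, treatment alone cannot meet the standard.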

3. CONCLUSIONS

Since the research was conducted on Portuguese data,
the optimisation procedure based on several standards was
devised in order to give decision-makers comparative costs.
This would allow them to select the proper choice, for
instance during the legal adaptation period to EEC
legislation, of a slackened standard having less drastic
effects in an already weakened economy.

It is known that the optimum can be improved by
refining the grid. However, given that the simulation routine
increases the computing time considerably for any new
standard tested, and given the available cost
functions and preliminary treatment plant designs, it is
thought that a sufficient degree of accuracy is achieved.

Abstract

This paper presents a review of the current literature on the branch
of multi-criteria decision modelling known as Goal Programming (GP).
The results of our in-depth investigations of the two main GP methods,
lexicographic and weighted GP, together with their distinct application
areas, are reported.

Some  guidelines  to  the  scope  of  GP  as  an  application  tool  are  given 
and  methods  of  determining  which  problem  areas  are  best  suited  to  the 
different  GP  approaches  are  stated.  The  correlation  between  the  method 
of  assigning  weights  and  priorities  and  the  standard  of  the  results  is  also 
ascertained. 

Key  Words:  Goal  Programming,  Lexicographic,  Weighted 


1  Introduction 

Goal Programming is a branch of multi-criteria decision analysis. It was first
introduced by Charnes and Cooper in 1955 [1], more explicitly defined by the
same authors in 1961 [2], and further developed by Ijiri [3] during the 1960s.
The first books dedicated to GP, by Lee [4] and Ignizio [5], appeared during the
early to mid 1970s.

In  the  1970’s  GP  and  its  variants  were  applied  to  many  different  subject 
areas.  These  include  academic  resource  planning  [6,  7],  accounting  [8],  agricul¬ 
tural  planning  [9],  energy  forecasting  [10],  portfolio  management  [4,  11],  water 
resource  planning  [12],  library  management  [13],  and  media  scheduling  [14]. 

Address for Correspondence: Dr. M. Tamiz, School of Mathematical Studies, University
of Portsmouth, Mercantile House, Hampshire Terrace, Portsmouth, PO1 2EG, England.

Questions were raised as to the effectiveness of GP as an application tool
by Zeleny [15] and Harrald [16] during the late 1970's and early 1980's, but GP
still grew in popularity, judging by the increase of papers applying GP during
the 1980's. Table 1 shows the number of papers on the subject of GP (both
theoretical and applicational) during the past decade in the Journal of the
Operational Research Society, which may well be considered to be a representative
sample of GP publications.


Table 1: Frequency of GP papers in the
Journal of the OR Society

Year    Total    Theoretical    Application

The  results  show  a  continuing  healthy  interest  in  GP.  Among  the  application 
areas  utilised  or  extended  in  the  past  ten  years  are  farm  growth  planning  [17], 
diet  planning  [18,  19],  locational  analysis  [20,  21],  academic  resource  planning 
[22,  23],  manpower  planning  [24],  police  scheduling  [25],  portfolio  analysis  [26], 
interest  rate  models  [27],  engineering  [28],  and  manufacturing  [29]. 

With the onset of powerful computers, sophisticated algorithms have been
developed by Ignizio [30], Schniederjans and Kwak [31], and others [32, 33, 34].
Olson [35] compares computational time for four GP algorithms and
demonstrates the benefits of using Revised-Simplex and Primal-Dual algorithms to
solve GP problems. These have made solutions to large scale GP problems
possible and several papers have been published exploiting this [24, 25, 36].
Work has also continued into special case GP algorithms: Integer, Zero-One,
Fuzzy, Interactive and Chance-Constrained. A breakdown of publications in
these areas is given in Romero [37]. In total he lists 355 papers dealing with
GP applications in 26 distinct areas.

Research has been done to apply other Multi-Criteria and Management
Science techniques to Goal Programming. These include interactive multi-criteria
methods [38], 'Delphi' techniques [39, 40], Saaty's [41] analytical hierarchy
approach [36, 23, 39], and resource planning and management systems (RPMS)
networks [42]. Recently papers have been published dealing with some of the
perceived 'errors' in GP [37, 40, 43], and explaining how these can be avoided
by the correct setting of weights, goals, priority levels etc.

The remainder of the paper is divided into four sections. Section 2
deals with lexicographic (pre-emptive) GP, section 3 with weighted (non
pre-emptive) GP, and section 4 with the connection between utility functions
and GP. Finally, section 5 draws conclusions as to the current direction of GP
and the direction of the authors' future research.

2  Lexicographic  GP 

Of the 355 papers mentioned by Romero [37], 226 use the concept of
Lexicographic GP (LGP), which requires the pre-emptive ordering of priority levels.
The standard LGP model can be algebraically represented as:

$$\text{Lex min } a = (g_1(n,p),\ g_2(n,p),\ \ldots,\ g_K(n,p))$$

subject to,

$$f_i(x) + n_i - p_i = b_i, \quad i = 1, \ldots, m$$

This model has K priority levels and m objectives; a is an ordered vector of
these K priority levels.

A standard 'g' (within priority level) function is given by:

$$g_k(n,p) = \alpha_{k1} n_1 + \ldots + \alpha_{km} n_m + \beta_{k1} p_1 + \ldots + \beta_{km} p_m$$
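The lexicographic ordering of the achievement vector can be illustrated with a short Python sketch. Python tuples already compare lexicographically, so the vector a can be represented as a tuple. The function name `lex_achievement`, the weight layout `alpha[k][i]` / `beta[k][i]` and all numbers are hypothetical illustrations, not taken from the paper.

```python
def lex_achievement(n, p, alpha, beta):
    """Return the ordered vector a = (g_1, ..., g_K).

    n, p   : negative and positive deviations of the m goals
    alpha, beta : K x m weight lists (hypothetical layout), one row
                  per priority level
    """
    return tuple(
        sum(a_ki * n_i for a_ki, n_i in zip(alpha_k, n))
        + sum(b_ki * p_i for b_ki, p_i in zip(beta_k, p))
        for alpha_k, beta_k in zip(alpha, beta)
    )


# Two priority levels and two goals (illustrative weights only).
alpha = [[1, 0], [0, 1]]
beta = [[1, 0], [0, 0]]

a_first = lex_achievement([0, 2], [2, 0], alpha, beta)
a_second = lex_achievement([1, 0], [0, 5], alpha, beta)

# Tuple comparison is lexicographic: priority level 1 decides first,
# level 2 only breaks ties.
better = min(a_first, a_second)
```

Here the second candidate wins because it is better at the first priority level, regardless of the lower levels.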

This  paper  will  summarize  the  development  of  algorithms  to  solve  the  LGP 
model,  work  on  the  multi-dimensional  dual  [30,  44],  and  current  thinking  on 
methods  of  priority  ranking  and  weighting  within  the  priority  levels.  Some 
applications  of  LGP  will  be  commented  on,  in  an  effort  to  outline  which  types 
of  problem  are  suitable  for  an  LGP  approach,  and  which  are  better  solved  using 
other  techniques. 

3  Weighted  GP 

Weighted (or non-pre-emptive) GP (WGP) requires no pre-emptive ordering of
the objective functions. Instead all the different deviations are placed in a single
priority level objective with different weights to represent their importance.
Algebraically, a WGP has the following structure:

$$\text{Min } a = \sum_{i=1}^{k} (\alpha_i n_i + \beta_i p_i)$$

Subject to,

$$f_i(x) + n_i - p_i = b_i, \quad i = 1, \ldots, m$$

$$x \in C_s,$$

where $C_s$ is an optional constraint set. Of interest here are the problems
caused by incommensurability, i.e. objective functions being measured in
different units, and techniques used to overcome this. As in the LGP case, application
areas will be outlined.
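A minimal Python sketch of the weighted achievement function follows. It assumes the goal values f_i(x) have already been evaluated. The optional `percentage` flag divides each deviation by its target, which is one common way of handling incommensurable units and is shown here only as an assumption. The function names and numbers are hypothetical.

```python
def deviations(f_values, targets):
    """Split f_i(x) - b_i into negative (n_i) and positive (p_i) deviations."""
    n = [max(0.0, b - f) for f, b in zip(f_values, targets)]
    p = [max(0.0, f - b) for f, b in zip(f_values, targets)]
    return n, p


def wgp_objective(f_values, targets, alpha, beta, percentage=False):
    """Weighted sum of deviations; optionally scaled by the targets."""
    n, p = deviations(f_values, targets)
    total = 0.0
    for n_i, p_i, a_i, b_i, t_i in zip(n, p, alpha, beta, targets):
        scale = t_i if percentage else 1.0  # assumes non-zero targets
        total += (a_i * n_i + b_i * p_i) / scale
    return total


# Goal 1 overshoots its target 8 by 2, goal 2 falls short of 9 by 2.
plain = wgp_objective([10, 7], [8, 9], alpha=[1, 3], beta=[2, 1])
scaled = wgp_objective([10, 7], [8, 9], alpha=[1, 3], beta=[2, 1],
                       percentage=True)
```

With scaling, each term is a relative deviation, so goals measured in different units contribute on a comparable footing.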

4  Utility  Functions 

This section will deal with the connections between utility functions and
the different types of GP. It will explore the literature on the problems caused in
reconciling LGP and utility function theory. It will also examine recently
developed techniques to model GPs more closely around their underlying objective
functions [45].

5 Summary and Conclusions

The final section will draw conclusions as to the scope and limitations of GP
and highlight areas in which the authors intend to conduct further research.
The Economic Lot Scheduling Problem (ELSP) is to economically
schedule lots of one or more products on a single machine.
Demand is constant, backlogging is not allowed and the planning
horizon is infinite. The problem is to minimize total operating cost per
unit time, which is comprised of setup costs and inventory costs.
Setup costs are incurred whenever production of a lot is begun,
and inventory carrying costs can be defined as the time value of
money tied up in inventory.

An extension to the single machine/facility problem is the study of
environments where products are manufactured through several
operations. Such systems are, in general, called multi-stage
production systems. Multi-stage production systems have received a lot of
academic attention in recent years, focussing on the control of
work-in-process inventory and its functional relationship to the
manufacturing cycle time. It is by now a very well known fact that the
larger the production lot size, the longer the manufacturing cycle,
which in turn increases the work-in-process inventory. There exists a
vast literature modelling this relationship to varying degrees in
different models for different system configurations.
The Multi-stage Economic Lot Scheduling Problem (MS-ELSP) brings
together two important problem characteristics inherent to multi-item
and multi-stage problems. In a multi-item problem, the main issue is
that of creating schedules which avoid the interference that is likely
to occur when two or more products compete for the same facility.
We will refer to this as the 'feasibility issue'. In a multi-stage
environment, the production should be synchronized so that
concurrent production of the same lot is not possible in
consecutive stages. This characteristic leads to the definition of
work-in-process inventory which, in fact, is a tool for the synchronization of
production among stages. Thus, in multi-stage problems, creating
schedules having this property will be referred to as the 'consistency
issue'. This study addresses the Multi-stage Economic Lot Scheduling
Problem with the objective of determining feasible and consistent
schedules which result from the conventional tradeoff between setup
costs and inventory holding costs comprising the total cost of a
schedule.

In this research, we restrict the study of MS-ELSP to serial systems
where there are m products to be manufactured through n distinct
stages. We first analyze the two product - two stage problem. In
order to guarantee feasibility, common cycle solutions, in which the
possible values of cycle times for all items are constrained to a single
cycle time value, T, are sought. In a two-stage production
system, production of a lot on the second stage cannot begin until
its production on the first stage is completed. Therefore, production
between stages should be synchronized so that we end up with
consistent schedules. To ensure consistency, we define a constraint
for each product, which also provides information about the
work-in-process inventories.

Another important point in this study is the presence of nonnegative
setup times. Setup times mean a loss in the productive capacity and
their effect on lot sizes is most significant when the capacity
utilization is high. On the other hand, work-in-process inventories
tend to increase with increasing capacity utilization. Therefore,
ignoring setup times will result in overestimated lot sizes due to
underestimation of work-in-process inventories.

The mathematical programming formulation of this problem is
developed, where the objective function is nonlinear with a linear set
of constraints. Setting the cycle time to a fixed value, we first
linearize the objective function. By using the dual problem and
complementary slackness, the optimal solution of this problem and
thus the optimal cycle time for the two product - two stage problem
are obtained. Besides, we have the exact terms for the
work-in-process inventories (queueing inventories: inventory that builds up on
the previous stage if the successor stage is busy with processing the
other products), since they can be expressed explicitly as analytical
functions of the cycle time. Then, we generalize our result to the
multi-product case in a two stage system, which constitutes a basis for the
analysis of the m-product, n-stage economic lot scheduling
problem.

Key  words:  Economic  Lot  Scheduling  Problem,  multi-stage 
In this paper we present computational experience with a primal-dual interior point method for smooth convex
programming problems of the type

$$\min\ c^T x \quad \text{s.t.} \quad g(x) \le 0, \qquad (1)$$

where $c \in R^n$ and $g : R^n \to R^m$ is a vector-valued function. We assume that each component $g_j$ is
convex. Let $s \in R^m$ be the vector of slack variables. The inequality constraints in (1) are replaced by

$$g(x) + s = 0, \quad s \ge 0.$$

Given a parameter $\mu > 0$, we associate with (1) the barrier problem

$$\min\ c^T x - \mu \sum_{i=1}^{m} \ln s_i \quad \text{s.t.} \quad g(x) + s = 0, \quad s > 0. \qquad (2)$$

We assume that Slater's condition holds:

Assumption 0.1 There is an $x \in R^n$ such that $g(x) < 0$.

We also assume

Assumption 0.2 The set $\{x : g(x) \le 0 \text{ and } c^T x \le b\}$ is bounded for all $b$.

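Under these assumptions Problem (2) has a solution. The necessary and sufficient conditions for
optimality, namely the Karush-Kuhn-Tucker equations, or KKT equations, are

$$Ys - \mu e = 0, \qquad (3)$$

$$g(x) + s = 0, \qquad (4)$$

$$c + \left(\frac{\partial g}{\partial x}\right)^T y = 0, \qquad (5)$$

with $s \ge 0$ and $y \ge 0$. Here

$$\frac{\partial g}{\partial x} = \left(\frac{\partial g_i(x)}{\partial x_j}\right)$$

is the Jacobian matrix of $g$ and $y \in R^m$ is a vector of dual variables.
Let

$$F : R^m \times R^m \times R^n \to R^m \times R^m \times R^n$$

be a vector-valued function defined by

$$F(z) = \begin{pmatrix} F_c \\ F_p \\ F_d \end{pmatrix} = \begin{pmatrix} Ys - \mu e \\ g(x) + s \\ c + \left(\frac{\partial g}{\partial x}\right)^T y \end{pmatrix}$$

with $z = (y, s, x)$. $F$ also depends on the parameter $\mu > 0$. With this notation, the KKT system is simply

$$F(z) = 0.$$

We also introduce the Lagrangean

$$L(y; x, s) = c^T x + y^T (g(x) + s). \qquad (6)$$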
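The KKT system (3) - (5) can be rewritten as

$$Ys - \mu e = 0, \quad \frac{\partial L}{\partial x} = 0, \quad \text{and} \quad \frac{\partial L}{\partial y} = 0.$$

Following usual terminology, a point $z = (y, s, x)$ is interior if $y > 0$ and $s > 0$. We do not require it to
be primal or dual feasible. At such a point, we define the Newton direction $dz = (dy, ds, dx)$ by

$$\frac{\partial F}{\partial z}\, dz + F = 0. \qquad (7)$$

Note that

$$\frac{\partial F}{\partial z} = \begin{pmatrix} S & Y & 0 \\ 0 & \frac{\partial^2 L}{\partial y \partial s} & \frac{\partial^2 L}{\partial y \partial x} \\ \frac{\partial^2 L}{\partial x \partial y} & 0 & \frac{\partial^2 L}{\partial x^2} \end{pmatrix}$$

with

$$\frac{\partial^2 L}{\partial y \partial s} = I, \quad \frac{\partial^2 L}{\partial x \partial y} = \left(\frac{\partial g}{\partial x}\right)^T, \quad \text{and} \quad \frac{\partial^2 L}{\partial x^2} = \sum_{i=1}^{m} y_i \frac{\partial^2 g_i}{\partial x^2}.$$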

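Since the $g_i$ are convex, $\partial^2 L / \partial x^2$ is positive semi-definite. Let us make the further assumption

Assumption 0.3 Let $y > 0$ and $s > 0$. The matrix

$$\frac{\partial^2 L}{\partial x^2} + \frac{\partial^2 L}{\partial x \partial y}\, S^{-1} Y\, \frac{\partial^2 L}{\partial y \partial x}$$

is positive definite.

A sufficient condition for that is:

$$\frac{\partial^2 L}{\partial x^2} = \sum_{i=1}^{m} y_i \frac{\partial^2 g_i}{\partial x^2}$$

is positive definite, or $\partial^2 L / \partial x \partial y$ has full row rank, or both.

Under Assumption 0.3, $\partial F / \partial z$ is regular at any interior point. Thus

$$dz = -\left(\frac{\partial F}{\partial z}\right)^{-1} F.$$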


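Let us explicitly write and solve the system (7) in $dy$, $ds$ and $dx$, where $F_c$, $F_p$ and $F_d$ denote the
three block components of $F$:

$$S\, dy + Y\, ds + F_c = 0$$

$$ds + \frac{\partial^2 L}{\partial y \partial x}\, dx + F_p = 0$$

$$\frac{\partial^2 L}{\partial x \partial y}\, dy + \frac{\partial^2 L}{\partial x^2}\, dx + F_d = 0.$$

In these expressions we used the fact that $\partial^2 L / \partial y \partial s = I$.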
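One way to carry out the elimination is sketched below. The shorthand is ours and is an assumption of this sketch: $J = \partial g / \partial x$, $H = \partial^2 L / \partial x^2$, $S$ and $Y$ are the diagonal matrices formed from $s$ and $y$, and $F_c$, $F_p$, $F_d$ are the complementarity, primal and dual blocks of $F$. The reduced matrix that appears is the one required to be positive definite in Assumption 0.3.

```latex
% Block form of the Newton system (7):
%   S dy + Y ds + F_c = 0,   ds + J dx + F_p = 0,   J^T dy + H dx + F_d = 0.
\begin{align*}
ds &= -F_p - J\,dx
   && \text{(second block)} \\
dy &= -S^{-1}\bigl(F_c + Y\,ds\bigr)
    = -S^{-1}F_c + S^{-1}Y\bigl(F_p + J\,dx\bigr)
   && \text{(first block)} \\
\bigl(H + J^{T} S^{-1} Y J\bigr)\,dx
   &= -F_d + J^{T} S^{-1}\bigl(F_c - Y F_p\bigr)
   && \text{(third block, after substitution)}
\end{align*}
```

Only one symmetric system in $dx$ has to be solved; $ds$ and $dy$ then follow by back-substitution.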
The algorithm goes as follows: Given an interior, but not necessarily feasible, point, we compute the
search direction $dz$ associated with $\mu$. Then a step is taken along that direction such that the interior
property is maintained. Namely, let $\bar\alpha := \max\{\alpha : y + \alpha\, dy \ge 0,\ s + \alpha\, ds \ge 0\}$ and let $0 < \gamma < 1$. Then
the next iterate is given by

$$x := x + \gamma \bar\alpha\, dx$$
$$s := s + \gamma \bar\alpha\, ds$$
$$y := y + \gamma \bar\alpha\, dy.$$

The choice of $\mu$ is adaptive. For "normal" steps, we take $\mu =$ [...]. If $\min_i y_i s_i <$ [...], the vector $Ys$ is
considered excessively unbalanced and we take $\mu =$ [...]. This step is named "centering".
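The step rule can be sketched in a few lines of Python. This is an illustration under assumptions: vectors are plain lists, the names `max_step` and `damped_update` are hypothetical, and the fallback to a unit step when no component limits the step is our own choice, not something stated in the abstract.

```python
import math


def max_step(v, dv):
    """Largest alpha with v + alpha * dv >= 0 componentwise (inf if none binds)."""
    return min((-v_i / dv_i for v_i, dv_i in zip(v, dv) if dv_i < 0),
               default=math.inf)


def damped_update(x, s, y, dx, ds, dy, gamma=0.9):
    """Take the fraction gamma (0 < gamma < 1) of the maximal step."""
    alpha_bar = min(max_step(y, dy), max_step(s, ds))
    if math.isinf(alpha_bar):
        alpha_bar = 1.0  # assumption: plain Newton step when nothing binds
    step = gamma * alpha_bar
    move = lambda v, dv: [v_i + step * dv_i for v_i, dv_i in zip(v, dv)]
    return move(x, dx), move(s, ds), move(y, dy)


x_new, s_new, y_new = damped_update(
    x=[0.0], s=[1.0, 1.0], y=[1.0, 2.0],
    dx=[1.0], ds=[0.5, -4.0], dy=[-2.0, 1.0],
)
```

Because gamma is strictly less than 1, the new s and y stay strictly positive, which is the interior property the algorithm maintains.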

We  tested  our  algorithm  on  a  sample  of  medium  size  random  problems.  We  primarily  studied  the  effect 
of  varying  the  size  of  the  problems.  We  observed  that  the  number  of  iterations  increases  slowly  with  the 
number  of  constraints  and,  surprisingly  enough,  it  decreases  with  the  number  of  free  variables  in  the  case 
of  quadratically  constrained  problems. 


We  analyzed  the  influence  of  centering  and  showed  it  to  be  positive.  We  also  studied  alternative  strategies 
for  the  step  size.  It  turns  out  that  taking  a  fixed  fraction  of  the  maximal  step  size  works  well  in  practice. 
Moreover  the  fraction  can  be  extremely  close  to  1  without  any  negative  effect  on  the  performance  of  the 
method.  Finally,  we  looked  at  different  choices  for  the  starting  point. 


We applied this algorithm to linear programming problems. The algorithm behaves somewhat differently than
with quadratic constraints. The iteration count increases both with the number of constraints and the
number of free variables. For the former the increase is slower. The figures are reasonable.
1 INTRODUCTION

Mathematical programming and the theory of scheduling contain many
optimization problems which are NP-hard in spite of their very simple structure. Thus these
problems are considered to be difficult to solve. But some of them are easy in the
sense that there are straightforward ways to generate feasible solutions for them, e.g.
the knapsack problem, the TSP and many scheduling problems.

One of them is the scheduling of identical parallel machines where the
maximal completion time has to be minimized. This problem is the topic of this
experimental study. It has several heuristics. The two basic ones are Graham's list
scheduling and the multi-fit algorithm. There are known upper bounds for the
relative accuracy of the heuristic solutions provided by these methods. The two
algorithms have quite different strategies. This is the reason that some problems which are
worst cases for list scheduling can be solved exactly by the multi-fit
algorithm and vice versa. This raises the question of how bad the accuracy of
the better of the list scheduling and the multi-fit solutions can be. This was the initial
question of this research. Another algorithm, called the interchanging method, has also been
investigated. The research also made it necessary to sharpen the well-known lower bound
on the optimal value of the objective function.

2 THE SCHEDULING PROBLEM

In the classical problem of the scheduling of parallel machines, n jobs have
to be distributed among m identical machines in such a way that the makespan is
minimal.

The whole operation starts at time 0. The machine independent processing
times are denoted by $p_j$ ($j = 1, \ldots, n$), which are positive integers. It is easy to see
that there is at least one optimal solution such that the machines start to work at
t=0 and work without any idle time until all jobs assigned to them have been
finished.

Let $C_j$ be the completion time of job j. The maximal completion time, i.e.
$\max\{C_j : j = 1, \ldots, n\}$, is denoted by $C^*$.

Theorem 1 [Graham 69], [Coffman et al. 78] In any problem

$$\max\Big\{\frac{1}{m}\sum_{j=1}^{n} p_j,\ \max\{p_j : j = 1, \ldots, n\}\Big\} \le C^* \le \max\Big\{\frac{2}{m}\sum_{j=1}^{n} p_j,\ \max\{p_j : j = 1, \ldots, n\}\Big\}. \qquad (1)\ \Box$$
The interval in which the optimal value must lie is denoted by [L,U], i.e.

$$L = \Big\lceil \max\Big\{\frac{1}{m}\sum_{j=1}^{n} p_j,\ \max\{p_j : j = 1, \ldots, n\}\Big\} \Big\rceil \qquad (2)$$

and

$$U = \Big\lfloor \max\Big\{\frac{2}{m}\sum_{j=1}^{n} p_j,\ \max\{p_j : j = 1, \ldots, n\}\Big\} \Big\rfloor \qquad (3)$$

Both the list scheduling and the multi-fit algorithm start with the
determination of the nonincreasing order of the processing times. The two algorithms
assign the jobs to machines in that order. Therefore without loss of generality it
may be assumed that

$$p_1 \ge p_2 \ge \ldots \ge p_n \qquad (4)$$


The  rule  of  the  list  scheduling  is  that 


every  job  is  assigned  to  a  machine  having  minimal  current  load. 


Theorem 2 [Graham 69] Let C(LS) be the value of the solution provided by the list
scheduling. Then

$$\frac{C(LS)}{C^*} \le \frac{4}{3} - \frac{1}{3m}. \qquad (5)\ \Box$$

Theorem 3 [Graham 69] If there is an optimal solution which assigns to each
machine at most 2 jobs, then the solution given by the list scheduling is optimal. □
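The list scheduling rule, applied to jobs in nonincreasing order of processing time, can be sketched as follows. The function name `list_schedule` and the tie-breaking rule (lowest machine index among equally loaded machines) are assumptions of this sketch. The data are the ten processing times of the first-phase example in Section 4.1.

```python
import heapq


def list_schedule(times, m):
    """Assign jobs in nonincreasing order to a machine of minimal current load.

    Returns (makespan, machines), where machines[k] lists the jobs of machine k.
    """
    heap = [(0, k) for k in range(m)]  # (current load, machine index)
    heapq.heapify(heap)
    machines = [[] for _ in range(m)]
    for p in sorted(times, reverse=True):
        load, k = heapq.heappop(heap)  # least loaded, ties by lowest index
        machines[k].append(p)
        heapq.heappush(heap, (load + p, k))
    return max(sum(jobs) for jobs in machines), machines


times = [30, 29, 24, 18, 17, 17, 17, 14, 13, 13]
makespan, machines = list_schedule(times, m=3)  # makespan is 72
```

The resulting makespan of 72 agrees with the list scheduling solution reported for this instance in the text.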


The multi-fit algorithm consists of two parts. A greedy method is the
internal part and a logarithmic search is the external part, which organizes the
applications of the greedy method. For the internal part an upper bound K of the
optimal value is assumed. The greedy method assigns each job to the first machine
into which it fits without exceeding the upper bound K. In the external part a current lower
bound and a current upper bound are assumed and are denoted by lc and uc. For
the internal part K is chosen as (lc + uc)/2. If the greedy method was able to find a
solution not worse than K, then uc becomes $\lfloor K \rfloor$, otherwise $lc = \lceil K \rceil$. The process
is repeated until the condition

uc = lc

is satisfied. Notice that it follows from the assumption of the integrality of the
processing times that the number of applications of the greedy method is O(log(U -
L)). Thus the multi-fit algorithm is polynomial.
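A compact Python sketch of the two parts is given below. It is not the authors' code: the search runs over integer capacities with the usual bisection updates (upper bound set to K on success, lower bound to K + 1 on failure), and the start interval [L, U] is taken from (2) and (3). The names `first_fit` and `multifit` are hypothetical.

```python
def first_fit(jobs_desc, m, cap):
    """Greedy part: put each job on the first machine where it fits within cap.

    Returns the list of machine loads, or None if more than m machines are needed.
    """
    loads = []
    for p in jobs_desc:
        for k in range(len(loads)):
            if loads[k] + p <= cap:
                loads[k] += p
                break
        else:  # no open machine can take the job
            if len(loads) == m:
                return None
            loads.append(p)
    return loads


def multifit(times, m):
    """External part: bisection on the capacity K between L and U."""
    jobs = sorted(times, reverse=True)
    total, longest = sum(jobs), jobs[0]
    lo = max(-(-total // m), longest)    # L of (2), integer ceiling
    hi = max((2 * total) // m, longest)  # U of (3), integer floor
    best = first_fit(jobs, m, hi)
    if best is None:                     # safeguard, not expected to trigger
        hi = total
        best = first_fit(jobs, m, hi)
    while lo < hi:
        cap = (lo + hi) // 2
        loads = first_fit(jobs, m, cap)
        if loads is not None:
            hi, best = cap, loads
        else:
            lo = cap + 1
    return max(best)


value = multifit([30, 29, 24, 18, 17, 17, 17, 14, 13, 13], m=3)  # value is 72
```

On the three difficult instances listed in Section 4 this sketch reproduces the multi-fit values quoted there (72, 35 and 42).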


Theorem 4 [Friesen 84] Let C(MF) be the value of the solution provided by the
multi-fit algorithm. Then

$$\frac{C(MF)}{C^*} \le 1.2 \qquad (6)\ \Box$$
A third heuristic method, called the interchanging algorithm, has been applied
in this research. It makes the following steps starting from any solution. It
interchanges one job of the most loaded machine with one job of another machine. The
interchange is possible if and only if the maximal completion time is decreased in
this way. Let s and t be the indices of the most loaded and the other machine,
respectively. The current loads of the machines are denoted by $L_s$ and $L_t$. Suppose that
job i of machine s is interchanged with job j of machine t. Then the following two conditions
must hold

$$p_i > p_j \qquad (7)$$

and

$$L_t + p_i - p_j < L_s. \qquad (8)$$

In the current version, if a possible interchange is found then it is executed.
The order of checking Conditions (7) and (8) is as follows. The jobs of the most
loaded machine are compared with the jobs of another machine, taking the other
machines in increasing load order. The jobs of the two machines are taken in
decreasing processing time order. One job of the most loaded machine is compared
with all of the jobs of the other machine. If no possible interchange is found then the
next job of the most loaded machine is taken. The number of comparisons in one
iteration is $O(n^2)$. To get a polynomial algorithm the number of interchanges has
been limited to m + 2. In the current version the solution provided by list
scheduling is the starting point. This algorithm is one of the simplest possible interchanging
methods. More generally, a subset of jobs can be interchanged for another subset of
jobs. In that case the complexity of the selection of the two subsets is much higher.
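The pairwise interchange step can be sketched in Python as below. The scan order follows one reading of the description (other machines by increasing load, then jobs of the most loaded machine, then jobs of the other machine, both by decreasing processing time), and the first admissible pair is swapped. The function name `interchange` and the tie-breaking rules are assumptions.

```python
def interchange(machines, max_swaps=None):
    """Improve a schedule by single-job swaps satisfying conditions (7) and (8).

    machines : list of job lists, one per machine (e.g. a list scheduling result)
    Returns (makespan, machines) after at most m + 2 interchanges by default.
    """
    machines = [sorted(jobs, reverse=True) for jobs in machines]
    m = len(machines)
    limit = m + 2 if max_swaps is None else max_swaps
    for _ in range(limit):
        loads = [sum(jobs) for jobs in machines]
        s = max(range(m), key=loads.__getitem__)  # most loaded machine
        others = sorted((t for t in range(m) if t != s), key=loads.__getitem__)
        swap = next(
            ((t, i, j)
             for t in others
             for i, p_i in enumerate(machines[s])
             for j, p_j in enumerate(machines[t])
             if p_i > p_j and loads[t] + p_i - p_j < loads[s]),  # (7), (8)
            None,
        )
        if swap is None:
            break
        t, i, j = swap
        machines[s][i], machines[t][j] = machines[t][j], machines[s][i]
        machines[s].sort(reverse=True)
        machines[t].sort(reverse=True)
    return max(sum(jobs) for jobs in machines), machines


# Start from the list scheduling solution of the first example in Section 4.1.
start = [[30, 17, 13], [29, 17, 14], [24, 18, 17, 13]]
improved, schedule = interchange(start)
```

On this instance two swaps are found and the makespan falls below the starting value of 72, although the optimum of 64 is not reached by single-job swaps in this scan order.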

3 IMPROVEMENTS OF THE LOWER BOUND

The randomly generated problems have not been solved with any kind of
enumerative method. One easy way to prove the optimality of a solution is to show that
its value and the lower bound coincide. Therefore it was important to find
some ways to improve the lower bound.

In (2) only two pieces of information are taken into consideration: the average load
and the maximal processing time. The following two sharpenings of the lower bound
are based on the number of jobs which must be assigned to
certain machines.

Theorem 5 Assume that (4) holds. Then

$$C^* \ge p_m + p_{m+1}. \qquad (9)\ \Box$$

Theorem 6 Assume that (4) holds. Let $r \ge 1$ be an integer with $rm + 1 \le n$. Then

$$C^* \ge \sum_{i=(m-1)r+1}^{rm+1} p_i. \qquad (10)\ \Box$$
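A counting argument of this kind can be checked numerically. The sketch below rests on the pigeonhole principle: among the rm + 1 longest jobs, some machine must receive at least r + 1 of them, so the sum of the r + 1 shortest of these jobs is a lower bound. The function names are hypothetical, and the bound is stated here as our reading of the sharpening, with r = 1 giving the sum of the m-th and (m+1)-th longest jobs.

```python
def average_bound(times, m):
    """Lower bound L of (2): rounded-up average load or the longest job."""
    return max(-(-sum(times) // m), max(times))


def pigeonhole_bound(times, m, r=1):
    """Sum of the r + 1 shortest jobs among the r*m + 1 longest ones."""
    jobs = sorted(times, reverse=True)
    if r * m + 1 > len(jobs):
        return 0  # not enough jobs for the argument to apply
    return sum(jobs[(m - 1) * r: r * m + 1])


times = [30, 29, 24, 18, 17, 17, 17, 14, 13, 13]
bounds = [pigeonhole_bound(times, 3, r) for r in (1, 2, 3)]

# Four equal jobs on three machines: the counting bound beats the average.
tight = pigeonhole_bound([6, 6, 6, 6], 3)
```

The best available bound is the maximum of the individual bounds, since each of them is valid on its own.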


In some cases there are jobs which do not affect $C^*$, because their
processing times are relatively very small. In those cases the following observation is
useful.

Theorem 7 Let S be any subset of the jobs. Let $L_S$ be any lower bound for the
problem defined by the jobs in S. Then $L_S$ is a lower bound for the original
problem. □

Theorem 8 Let k be any index with $1 \le k \le n$. Assume that: (i) the list scheduling
has assigned until that point exactly k jobs to machines, (ii) none of the machines
has more than two jobs, (iii) $p_{k-2} + p_{k-1} + p_k$ is at least as great as the current load
of any machine. Then the current maximal load is a lower bound for the optimal
value of the problem. □


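Theorem 9 Let k be a fixed index and

$$t_j = \Big\lfloor \frac{p_j}{p_k} \Big\rfloor, \quad j = 1, \ldots, n.$$

Then

$$\Big\lceil \frac{\sum_{j=1}^{n} t_j}{m} \Big\rceil\, p_k \le C^*. \ \Box$$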
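A bound of this type can be evaluated as follows. The reasoning assumed in this sketch is that job j contains at least floor(p_j / p_k) disjoint blocks of length p_k, and some machine must carry at least the rounded-up average number of blocks. The function names are hypothetical and the formula is our reading of the theorem.

```python
def unit_bound(times, m, k):
    """Lower bound built from blocks of length p_k (k is a 0-based index)."""
    p_k = times[k]
    blocks = sum(p // p_k for p in times)  # t_j summed over all jobs
    return -(-blocks // m) * p_k           # ceil(blocks / m) * p_k


def best_unit_bound(times, m):
    """Try every job as the reference job and keep the strongest bound."""
    return max(unit_bound(times, m, k) for k in range(len(times)))


times = [30, 29, 24, 18, 17, 17, 17, 14, 13, 13]
with_14 = unit_bound(times, 3, times.index(14))
best = best_unit_bound(times, 3)
```

For this instance the bound remains below the average-load bound of 64, which illustrates that the different bounds are useful on different kinds of instances.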

4 COMPUTATIONAL EXPERIENCES

The computational experiences have been made in three phases. In the first
phase about 000.000 problems belonging to different classes have been generated.
In this phase some observations have been made which modified the objectives of
the research. The second phase was the main one, in which 1.200.000 problems
have been generated in a wide range of problem classes to find difficult problems.
In the third phase further attempts have been made to find more difficult problems in the most promising
problem classes.


Definition 1 Let C(LS), C(MF), C(IC) and $C^*$ be the value of the
solution provided by the list scheduling, the multi-fit algorithm and the
interchanging method, and of the optimal solution, respectively. A particular problem is called
first order difficult if the value

$$\frac{\min\{C(LS), C(MF)\}}{C^*} \qquad (11)$$

is high. It is called second order difficult if the value

$$\frac{\min\{C(LS), C(IC), C(MF)\}}{C^*} = \frac{\min\{C(IC), C(MF)\}}{C^*} \qquad (12)$$

is high.
This definition is not correct in a strict mathematical sense, because the
meaning of the word "high" is undefined. This meaning has been determined during
the experiences.

A problem class is determined by the following parameters: m - the number
of machines, n - the number of jobs, p - the maximal possible processing time; the
processing times are generated randomly by the [1,p] integer uniform distribution.

In the experiences the following formulas have been used instead of (11)
and (12):

$$\frac{\min\{C(LS), C(MF)\}}{\bar L} \qquad (11')$$

$$\frac{\min\{C(IC), C(MF)\}}{\bar L} \qquad (12')$$

where $\bar L$ is some lower bound of the optimal value of the objective function.


4.1  Observations  of  the  First  Phase 


In  the  first  phase  only  the  list  scheduling  and  the  multi-fit  algorithm  have 
been  used. 

At the beginning of the experiences the lower bound $\bar L$ used in these ratios has been chosen as L of (2). Some problems
seemed to be difficult although an optimal solution had been obtained by one of the
methods. In some cases this fact could be proven by one of the improvements of the
lower bound discussed in Section 3.

Some problems had just the opposite behaviour. Here the lower bound
coincided with the optimal value. In many cases this fact could be proven by the
interchanging algorithm. This is the reason that this method had to be included
in the investigations.

Among the most difficult problems found in this phase there were many
in which the smallest processing time was relatively great. Therefore in the second
phase of the experiences the generation of the problems has been modified as
follows. The first thousand problems have been generated as earlier. In the case of
the problems of the second thousand the processing times were increased by 1, in
the case of the third thousand by 2, etc. This cannot be applied to all of the
classes, because in some cases, if the increase is not less than a certain value, the
problem becomes trivial regardless of the generated random numbers.

The problems which seemed to be difficult belonged to two different
categories. The first one is the set of first order difficult problems. The most difficult
problem in this sense was the following: n = 10, m = 3 and the processing times are
30, 29, 24, 18, 17, 17, 17, 14, 13, 13. The solution provided by the list scheduling
is as follows: M1: 30, 17, 13; M2: 29, 17, 14; M3: 24, 18, 17, 13. The multi-fit
solution is: M1: 30, 29, 13; M2: 24, 18, 17, 13; M3: 17, 17, 14. Both of them have
the value 72. But the optimal solution is the following: M1: 30, 17, 17; M2: 29, 18,
17; M3: 24, 14, 13, 13. The value of it is 64. Since that time Definition 1 has had
the meaning that a problem is first order difficult if

$$\frac{\min\{C(LS), C(MF)\}}{C^*} \ge \frac{9}{8}.$$

The  computationally  difficult  problems  belong  to  the  second  category.  In 
the  case  of  such  a  problem  it  is  difficult  either  to  find  the  optimal  solution  or  to 
prove  the  optimality  of  a  solution  generated  by  one  of  the  heuristics. 

4.2  Experiences  of  the  Main  Phase 

In the second phase an intensive search has been carried out for difficult
problems. 100.000 problems have been generated in each of the problem classes.
For a very large part of the problems in each class the generated solutions are
within 105%, and even within 101%, of the improved lower
bound. These results
are summarized in Table 1.

It turned out that neither the list scheduling nor the multi-fit algorithm is
superior to the other one. This is indicated by the numbers of problems for which
the appropriate heuristic solution is within 101%. The number of problem classes
for which a method is superior to the other one is approximately the same for
both algorithms. The behaviour of both methods is very different in the different
classes. But "the better of list scheduling and multi-fit" seems to be much more stable.


n/m/p          101%                  105%
               LS-MF      IC-MF      LS-MF      IC-MF
10/3/15        88067      94179      98817      99897
15/3/15        95177      99441      99997      99999
[illegible]    73970      85831      97418      [illegible]
15/3/30        89662      99002      99999      [illegible]
10/3/60        45787      69949      94304      99582
[illegible]    80167      98599      99996      [illegible]
[two rows illegible; surviving entries in reading order: 100000, 100000, 100000, 99132, 100000, 100000, 100000]
[illegible]    98244      98245      99043      99044
[illegible]    89275      97461      99995      100000
[illegible]    100000     100000     100000     [illegible]
total          960051     1043707    1089569    1098337
percentage     87.27      94.88      99.05      99.85

Table 1: The numbers of problems having good heuristic solution
parameters           MF        LS
10/3/15              87795     1318
15/3/15              7324      93662
10/3/30              73529     950
15/3/30              12688     86078
10/3/60              44898     1568
15/3/60              23804     72245
30/3/15              26577     99725
30/3/30              50626     99591
30/3/60              42090     99132
10/5/15              98236     97997
20/5/15              3649      87973
60/5/[illegible]     100000    99111

Table 2: Comparison of the list scheduling and multi-fit heuristics

The most first order difficult problem found in this phase is the following:
n = 10, m = 3 and the processing times are 15, 14, 12, 9, 8, 8, 8, 7,
6, 6. The solution provided by list scheduling is as follows: M1: 15, 8, 6, 6; M2:
14, 8, 7; M3: 12, 9, 8. The multi-fit solution is: M1: 15, 14, 6; M2: 12, 9, 8, 6; M3:
8, 8, 7. Both of them have the value 35. But the optimal solution is the following:
M1: 15, 8, 8; M2: 14, 9, 8; M3: 12, 7, 6, 6. Its value is 31.
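
The instance above can be re-checked with a short script. The following is only a sketch in Python, not the code used for the experiments. The function names are ours; list scheduling is assumed to process the jobs in non-increasing order, multi-fit is assumed to be a binary search over integer capacities using first-fit decreasing, and the optimum is obtained by plain enumeration, which is feasible only for such a small instance.

```python
import heapq
from itertools import product

def lpt_makespan(times, m):
    """List scheduling in non-increasing order: each job goes to the least loaded machine."""
    loads = [(0, k) for k in range(m)]
    heapq.heapify(loads)
    for p in sorted(times, reverse=True):
        load, k = heapq.heappop(loads)
        heapq.heappush(loads, (load + p, k))
    return max(load for load, _ in loads)

def ffd_loads(times, m, cap):
    """First-fit decreasing into m bins of capacity cap; None if some job does not fit."""
    loads = [0] * m
    for p in sorted(times, reverse=True):
        for k in range(m):
            if loads[k] + p <= cap:
                loads[k] += p
                break
        else:
            return None
    return loads

def multifit_makespan(times, m):
    """Binary search for the smallest integer capacity accepted by first-fit decreasing."""
    average = -(-sum(times) // m)            # average load, rounded up
    lo = max(max(times), average)            # no schedule can be shorter than this
    hi = max(max(times), 2 * average)        # first-fit decreasing always fits here
    while lo < hi:
        mid = (lo + hi) // 2
        if ffd_loads(times, m, mid) is None:
            lo = mid + 1
        else:
            hi = mid
    return max(ffd_loads(times, m, hi))

def optimal_makespan(times, m):
    """Enumerate all m**n assignments (only sensible for tiny instances)."""
    best = sum(times)
    for assignment in product(range(m), repeat=len(times)):
        loads = [0] * m
        for p, k in zip(times, assignment):
            loads[k] += p
        best = min(best, max(loads))
    return best

times = [15, 14, 12, 9, 8, 8, 8, 7, 6, 6]
print(lpt_makespan(times, 3), multifit_makespan(times, 3), optimal_makespan(times, 3))
```

Under these assumptions the script reproduces the values quoted in the text for this instance (35 for both heuristics, 31 for the optimum).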

4.3 Further Difficult Problems

The aim of the third phase has been to find further difficult problems. Some
new problem classes are introduced, because on the basis of the previous
experience it is likely that these classes contain the desired items. At the end of this phase
the number of generated problems has exceeded 2,000,000.

The class 19/8/15 contained the most difficult known problem. Its processing
times are: 21, 21, 20, 20, 19, 18, 17, 17, 16, 16, 16, 16, 12, 12, 12, 11,
11, 10, 10. The multi-fit solution is: M1: 21, 21; M2: 20, 20; M3: 19, 18; M4: 17,
17; M5: 16, 16, 10; M6: 16, 16, 10; M7: 12, 12, 12; M8: 11, 11. Its value
is 42, which is attained on M1, M5 and M6. The solution provided by list
scheduling, with value 43, is: M1: 21, 11, 11; M2: 21, 12; M3: 20, 12, 10; M4:
20, 12, 10; M5: 19, 16; M6: 18, 16; M7: 17, 16; M8: 17, 16. In the optimal solution
the completion time is 37 on all of the machines except the last one, where it is 36:
M1: 21, 16; M2: 21, 16; M3: 20, 17; M4: 20, 17; M5: 19, 18; M6: 16, 11, 10; M7:
16, 11, 10; M8: 12, 12, 12.
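
The three schedules quoted above can be checked mechanically. The sketch below (Python, with variable names of our own choosing) only recomputes the completion times and compares them with the simple lower bound given by the average machine load rounded up; it does not generate the schedules.

```python
# Schedules as listed in the text, one inner list of processing times per machine.
multifit_schedule = [[21, 21], [20, 20], [19, 18], [17, 17],
                     [16, 16, 10], [16, 16, 10], [12, 12, 12], [11, 11]]
list_schedule = [[21, 11, 11], [21, 12], [20, 12, 10], [20, 12, 10],
                 [19, 16], [18, 16], [17, 16], [17, 16]]
optimal_schedule = [[21, 16], [21, 16], [20, 17], [20, 17],
                    [19, 18], [16, 11, 10], [16, 11, 10], [12, 12, 12]]

def makespan(schedule):
    """Largest completion time over all machines."""
    return max(sum(machine) for machine in schedule)

def jobs_of(schedule):
    """Sorted multiset of processing times used by a schedule."""
    return sorted(p for machine in schedule for p in machine)

# All three schedules must use the same 19 processing times.
same_jobs = jobs_of(multifit_schedule) == jobs_of(list_schedule) == jobs_of(optimal_schedule)

# The average load, rounded up, is a lower bound on every makespan.
jobs = jobs_of(optimal_schedule)
lower_bound = -(-sum(jobs) // len(optimal_schedule))

print(same_jobs, makespan(multifit_schedule), makespan(list_schedule),
      makespan(optimal_schedule), lower_bound)
```

Since the third schedule attains the lower bound of 37, its optimality follows directly from the bound.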

The development of the accuracy of the known most first order difficult
problems has been: [illegible] < 35/31 < 42/37. The value 42/37, which is not proven to
be an upper bound, is less than the value 72/61 guaranteed by the algorithm of
[Friesen-Langston 86], which uses many operations from a practical point of view.

There was no improvement in the position of the most second order difficult
problem in this phase.

4.4 Some Other Observations

Some  other  observations  are  obtained  from  the  experiences.  An  important 
one  is  the  following.  If  L  ^  U  then  U  is  far  from  the  optimal  value.  The  10/5/15 
class is the only one where the ratio (13) had a value greater than 1.22. The greatest
observed value is 1.42.

The aim of the improvements of the lower bound was to decrease the number
of cases to check. Table 3 gives, for the better of the multi-fit and interchanging
procedures, the number of problems which have been proved to be solved within 101%
and the worst observed ratio (14), both before any improvement and after the
improvements (without the application of Theorem 9).


parameters           101%                   (14)
               before     after       before        after
10/3/15         88078     89871       1.217         1.120
15/3/15         99344     99344       1.030         1.030
10/3/30         65089     68313       1.262         1.102
[15/3/30]       99098     99098       1.032         [illegible]
[10/3/60]       44046     71572       1.211         1.100
[15/3/60]       99045     99045       1.032         1.032
30/3/15        100000    100000       1.005         1.005
[30/3/30]      100000    100000       1.007         1.007
[30/3/60]      100000    100000       1.005         1.005
10/5/15         37056     88469       [illegible]   1.200
[20/5/15]       97240     97240       1.040         [illegible]
60/5/60        100000    100000       [illegible]   [illegible]
Table 3: The effect of the improvements of the lower bound

Abstract

Tabu search is a metastrategy for guiding known heuristics to overcome local
optimality. Successful applications of this kind of metaheuristic to a great variety of
problems have been reported in the literature. Recently some implementations of
tabu search on parallel computers have come up. Whereas these implementations
are tailored to specific problems, we attempt to provide ideas for a more general
concept for developing parallel tabu search algorithms.


1  Introduction 

Due to the complexity of a great variety of combinatorial optimization problems, heuristic
algorithms are especially relevant for dealing with large scale problems. The main drawback
of algorithms such as deterministic exchange procedures is their inability to continue
the search upon becoming trapped in a local optimum. This suggests consideration of
recent techniques for guiding known heuristics to overcome local optimality. Following
this theme, the application of the tabu search metastrategy for solving combinatorial
optimization problems is investigated.

The key issue in designing parallel algorithms is to decompose the execution of the
various ingredients of a procedure into processes executable by parallel processors.
Improvement procedures like tabu search or simulated annealing, however, at first glance have
an intrinsically sequential nature due to the idea of performing the neighbourhood search
from one solution to the next. Therefore, there is not yet a common or generally applicable
parallelization of tabu search in the literature. In the sequel we attempt to describe
some general ideas and a classification scheme for parallel tabu search algorithms.

In Section 2, we present an outline of tabu search. Before describing some concepts
for parallel tabu search algorithms in more detail (see Section 4), we briefly discuss some
of the common parallel machine models and algorithms in Section 3. Some examples
are given in Section 5 and finally some conclusions are drawn (Section 6). The attempt,
of course, is not to give a complete treatment of parallel tabu search but to sketch the
potential this area of research carries. For a more detailed treatment of the ideas of this
paper see Voß (1992).

2 Tabu Search

Many solution approaches are characterized by identifying a neighbourhood of a given
solution which contains other (transformed) solutions that can be reached in a single
iteration. A transition from a feasible solution to a transformed feasible solution is referred
to as a move and may be described by a set of one or more attributes. In a zero-one
integer programming context, e.g., these attributes may be the set of all possible value
assignments or changes in such assignments for the binary variables. (Then two attributes
denoting that a certain binary variable is set to 1 or 0 may be called complementary to
each other.) Following a steepest descent/mildest ascent approach, a move may either
result in a best possible improvement or a least deterioration of the objective function
value. Without additional control, however, such a process can cause a locally optimal
solution to be re-visited immediately after moving to a neighbour.

To prevent the search from endlessly cycling between the same solutions, tabu search
may be visualized as follows. Imagine that the attributes of all moves are stored in a running
list, representing the trajectory of solutions encountered. Then, related to a sublist
of the running list, a so-called tabu list may be defined. Based on certain restrictions, it
keeps some moves, consisting of attributes complementary to those of the running list,
which will be forbidden in at least one subsequent iteration because they might lead back
to a previously visited solution. Thus, the tabu list restricts the search to a subset of admissible
moves (consisting of admissible attributes or combinations of attributes). This
hopefully leads to 'good' moves in each iteration without re-visiting solutions already
encountered. A general outline of a tabu search procedure (for solving a minimization
problem) may be described as follows:

Tabu Search

Given: A feasible solution x* with objective function value z*.

Start: Let x := x* with z(x) = z*.

Iteration:

while stopping criterion is not fulfilled¹ do begin

(1) select best admissible move that transforms x into x' with objective
function value z(x') and add its attributes to the running list

(2) perform tabu list management: compute moves to be set tabu, i.e., update
the tabu list

(3) perform exchanges: x := x', z(x) = z(x')

if z(x) < z* then z* := z(x), x* := x endif

endwhile

Result: x* is the best of all determined solutions, with objective function value z*.
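
To make the outline concrete, the following Python sketch instantiates it for binary vectors with single-attribute moves (complementing one variable). It is an illustration under assumptions of our own: the function name tabu_search, the iteration limit as stopping criterion and the simple fixed-length tabu list are not prescribed by the outline, and each move is identified with the index of the complemented variable.

```python
from collections import deque

def tabu_search(x0, z, max_iter=100, tl_size=3):
    """Minimize z over binary vectors by complementing one variable per iteration."""
    x = list(x0)
    best_x, best_z = list(x), z(x)
    running_list = []                      # trajectory of performed moves
    tabu_list = deque(maxlen=tl_size)      # assumed rule: the last tl_size moves are tabu
    for _ in range(max_iter):              # stopping criterion: iteration limit
        # (1) select the best admissible move, improving or not
        candidates = []
        for i in range(len(x)):
            if i in tabu_list:
                continue
            x[i] ^= 1
            candidates.append((z(x), i))
            x[i] ^= 1
        if not candidates:
            break
        value, i = min(candidates)
        running_list.append(i)
        # (2) tabu list management
        tabu_list.append(i)
        # (3) perform the exchange and keep the best solution found so far
        x[i] ^= 1
        if value < best_z:
            best_x, best_z = list(x), value
    return best_x, best_z

# Hypothetical objective with a local optimum at the starting solution (1, 0, 0).
z = lambda x: abs(4 * x[0] + 3 * x[1] + 2 * x[2] - 5)
print(tabu_search([1, 0, 0], z, max_iter=10, tl_size=2))
```

In this small example no single complement improves the starting solution, so a pure descent method would stop there, whereas the sketch accepts non-improving moves and reaches the solution (0, 1, 1) with value 0.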

* * *

For a background on tabu search and a number of references on successful applications
of this metaheuristic see, e.g., Glover (1989, 1990), Glover and Laguna (1992), and Voß
(1992).

¹ A possible stopping criterion can be, e.g., a prespecified time limit.

Tabu List Management

Tabu list management concerns updating the tabu list, i.e., deciding on how many and
which moves have to be set tabu within any iteration of the search. Up to now, the most
popular approach in the literature is to apply static methods like the tabu navigation method
(TNM).

In TNM, single attributes are set tabu as soon as their complements have been part
of a selected move. The attributes stay tabu for a distinct time, i.e. number of iterations,
until the probability of causing a solution's re-visit is small. The efficiency of the algorithm
depends on the choice of the tabu status duration, i.e. the length tl_size of the tabu list.
(In the literature often a 'magic' tl_size=7 is proposed.) For the sake of improved
effectiveness, a so-called aspiration level criterion is considered, which permits the choice of
an attribute even when it is tabu. This can be advantageous when a new best solution
may be calculated, or when the tabu status of the attributes prevents any feasible
move.
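
The static rule and the aspiration level criterion may be sketched as follows in Python. The data layout is an assumption made for illustration only: tabu_until maps an attribute to the last iteration in which it is still tabu, candidates are (objective value, attribute) pairs for a minimization problem, and the fallback in the last line is one possible reading of the case in which the tabu status blocks every move.

```python
def set_tabu(tabu_until, attribute, iteration, tl_size=7):
    """After a move at `iteration`, its attribute stays tabu for the next tl_size iterations."""
    tabu_until[attribute] = iteration + tl_size

def is_tabu(tabu_until, attribute, iteration):
    """True while the tabu duration of the attribute has not yet expired."""
    return iteration <= tabu_until.get(attribute, -1)

def admissible(candidates, tabu_until, iteration, best_z):
    """Keep non-tabu moves plus tabu moves that would yield a new best value (aspiration)."""
    keep = [(value, attr) for value, attr in candidates
            if not is_tabu(tabu_until, attr, iteration) or value < best_z]
    # If the tabu status blocks every move, ignore it for this iteration.
    return sorted(keep) if keep else sorted(candidates)

tabu_until = {}
set_tabu(tabu_until, "b", iteration=3)          # "b" is tabu in iterations 4 to 10
moves = [(5, "a"), (3, "b"), (7, "c")]
print(admissible(moves, tabu_until, iteration=4, best_z=4))
print(admissible(moves, tabu_until, iteration=4, best_z=2))
```

In the first call the tabu move "b" is admitted because its value 3 improves on the best value 4; in the second call it is excluded.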

The static approach, though successful in a great number of applications, seems to
be a rather limited one. Another, probably more fruitful, idea is to define an attribute
as being potentially tabu if it belongs to a chosen move and to handle it in a candidate
list first. Via additional criteria these attributes can be definitely included in the tabu
list if necessary, or excluded from the candidate list if possible. Therefore, the candidate
list is an intermediate list between a running list and a tabu list. Glover (1990) suggests
the use of different candidate list strategies in order to avoid extensive computational
effort without sacrificing solution quality. In the sequel, we sketch the following dynamic
strategies for managing tabu lists: the cancellation sequence method (CSM, in a revised
version, cf. Dammeyer et al. (1991)), and the reverse elimination method (REM).

CSM as well as REM use additional criteria for setting attributes tabu. The
primary goal is to permit the reversion of any attribute but one between two solutions
in order to prevent re-visiting the older one. To find those critical moves, CSM needs a
candidate list that contains the complements of attributes being potentially tabu. This
active tabu list (ATL) is built like the running list, except that elimination of certain
attributes is permitted. Whenever an attribute of the last performed move finds its
complement on ATL, this complement will be eliminated from ATL. All attributes between
the cancelled one and its recently added complement build a cancellation sequence
separating the actual solution from the solution that has been left by the move that contains
the cancelled attribute. Any attribute but one of a cancellation sequence is allowed
to be cancelled by future moves. This condition is sufficient but not necessary, as some
additional aspects have to be taken into account so that CSM works well.

The method works well for the case that a move consists of exactly one attribute, i.e.,
when so-called single-attribute moves are considered instead of multi-attribute moves. In
addition, the corresponding parameters have to be chosen appropriately (e.g. the tabu
list duration of a tabu attribute, and how to apply the aspiration level criterion). Applying
CSM to multi-attribute moves needs additional criteria to prevent errors caused
by uncovered special cases. E.g. for paired-attribute moves (moves consisting of exactly
two attributes) those moves must be prohibited that may cancel a cancellation sequence
consisting of exactly two attributes (because none of them is tabu when choosing a move).
In addition, for building a cancellation sequence, the remaining attributes of the older
and the current move are not necessarily taken into consideration. This depends on the
order in which the move's attributes are added to ATL.

The conditions of TNM and CSM need not be necessary to prevent re-visiting
previously encountered solutions. Necessity, however, can be achieved by REM. The idea
of REM is that any solution can only be re-visited in the next iteration if it is a neighbour
of the current solution. Therefore, in each iteration the running list will be traced back
to determine all moves which have to be set tabu (since they would lead to an already
explored solution). For this purpose, a residual cancellation sequence (RCS) is built up
stepwise by tracing back the running list. In each step exactly one attribute is processed,
from last to first. After initializing an empty RCS, only those attributes are added whose
complements are not in the sequence. Otherwise their complements in the RCS are
eliminated (i.e. cancelled). Then at each tracing step it is known which attributes have
to be reversed in order to turn the current solution back into one examined at an earlier
iteration of the search. If the remaining attributes in the RCS can be reversed by exactly
one move then this move is tabu in the next iteration. For single-attribute moves, for
instance, the length of an RCS must be one to enforce a tabu move. Correspondingly, in
a slightly modified method REM2 all common neighbours of the current solution and of
an already explored one will be forbidden. These neighbours were implicitly investigated
during a former step of the procedure (due to the choice of a best non-tabu neighbour)
and need not be looked at again (cf. Voß (1992)).
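
For single-attribute moves on binary variables the tracing step of REM can be sketched in a few lines of Python. The representation is an assumption made for illustration: every move is recorded in the running list by the index of the complemented variable, so that an attribute and its complement share the same index and cancel each other in the RCS. The function name is ours.

```python
def rem_tabu_attributes(running_list):
    """Trace the running list from last to first and collect the moves to be set tabu."""
    rcs = set()      # residual cancellation sequence
    tabu = set()
    for attribute in reversed(running_list):
        if attribute in rcs:
            rcs.remove(attribute)    # the complement is in the RCS: cancel it
        else:
            rcs.add(attribute)
        if len(rcs) == 1:
            # Reversing this single attribute would lead back to an explored solution.
            tabu.add(next(iter(rcs)))
    return tabu

# Moves performed so far: complement x1, then x2, then x1 again.
print(rem_tabu_attributes([1, 2, 1]))
```

In the example the current solution differs from the starting solution only in x2 and from the preceding solution only in x1, so both complements are set tabu. The loop runs over the whole running list in every iteration, which reflects the growth of the computational effort with the number of iterations.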

Obviously, the execution of REM and of REM2 represents a necessary and sufficient
criterion to prevent re-visiting known solutions. Since the computational effort of
REM increases with the number of iterations, ideas for reducing the number of
computations have been developed (cf. Glover (1990) and Dammeyer and Voß (1991a)).

For applications and (sequential) comparisons of TNM, CSM, and REM see Dammeyer
and Voß (1991b) and Domschke et al. (1992).

Search  Intensification  and  Search  Diversification 

A general idea for reducing the computational effort in a tabu search algorithm is that of
search intensification using a so-called short term memory. Its basic idea is to observe the
attributes of all performed moves and to eliminate those from further consideration that
have not been part of any solution generated during a given number of iterations. This
results in a concentration of the search where the number of neighbourhood solutions in
each iteration, and consequently the computational effort, decreases. Obviously the cost
of this reduction can be a loss of accuracy.

Correspondingly, a search diversification may be defined as a long term memory to
penalize often selected assignments. Then the neighbourhood search can be led into not
yet explored regions where the tabu list operation is restarted (resulting in an increased
computation time). An appealing opportunity for search diversification is created by the
idea of REM and REM2, resulting in REMt for t > 2 and integer. If at any tracing
step the attributes that have to be reversed to turn the current solution back into an
already explored one equal exactly t moves, then it is possible to set these moves tabu
for the next iteration. Note that for the case of multi-attribute moves, due to various
combinations of attributes to moves, even more than t moves may be set tabu in order to
avoid different paths through the search space leading to the same solution. Accordingly,
search diversification is obvious.

3 Parallel Machine Models

Over the years a great variety of architectures have been proposed for parallel computing.
The most widely known classification of parallel machine models (although somewhat
limited) is given by Flynn (1966). He distinguishes four general classes based on the idea
of whether single or multiple instruction streams are executed on either one or multiple
data set streams:

• SISD (Single Instruction, Single Data) including the classical sequential computers

• SIMD (Single Instruction, Multiple Data) including vector computers and array
processors

• MISD (Multiple Instructions, Single Data)

• MIMD (Multiple Instructions, Multiple Data) with the processors performing each
successive set of instructions either simultaneously (synchronous) or independently
(asynchronous)

The above classification of parallel machine models may lead to different classes of
parallel algorithms. Vectorized algorithms operate uniformly on vectors of data sets
(SIMD). Systolic ones operate rhythmically on streams of data sets (SIMD and synchronous
MIMD). Parallel processing algorithms operate on a set of synchronously communicating
parallel processors (synchronous MIMD). Correspondingly, asynchronous communication
leads to distributed processing algorithms (asynchronous MIMD and neural
networks).

In addition to architectural aspects, communication networks are used to classify parallel
machine models. For instance, it makes a difference whether processors have simultaneous
access to a shared memory, allowing communication between two arbitrary
processors in constant time, or whether they communicate through a fixed interconnection
network. Less formally, in certain models it is assumed that there is a master processor
controlling the communication of the network, with the remaining processors of the network
called slaves. For a comprehensive survey on parallel machines and algorithms see
e.g. Akl (1989) and Van Leeuwen (1990).

The quality of parallel algorithms may be judged by a number of quantities, the most
important one being the speedup, which is the running time of the best sequential
implementation of the algorithm divided by the running time of the parallel implementation
executed on a number of p processors. Similarly, given a prespecified time limit (cf.
footnote 1), a scaleup may be defined as the ratio of the average problem size solvable with
a parallel implementation to that solvable with a sequential implementation of the
algorithm. With heuristics, the solution quality attainable may also be measured. The
processor utilization or efficiency is the speedup divided by p. The best one can achieve
is a speedup of p and an efficiency equal to one.
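
The three measures can be written down directly. The following Python lines are a plain transcription of the definitions above; the function and parameter names are chosen by us and the numbers in the usage lines are invented for illustration.

```python
def speedup(t_sequential, t_parallel):
    """Running time of the best sequential implementation over the parallel running time."""
    return t_sequential / t_parallel

def efficiency(t_sequential, t_parallel, p):
    """Processor utilization: speedup divided by the number p of processors."""
    return speedup(t_sequential, t_parallel) / p

def scaleup(size_parallel, size_sequential):
    """Ratio of the average problem sizes solvable within the same time limit."""
    return size_parallel / size_sequential

# Hypothetical measurements: 120 s sequentially, 20 s on p = 8 processors.
print(speedup(120.0, 20.0), efficiency(120.0, 20.0, 8))
```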

4  Parallel  Tabu  Search  Algorithms 

Due to the success and the underlying simplicity of the main idea of tabu search, recently
some implementations on parallel computers have come up tailored to specific problems.
Surprisingly, to the best of our knowledge, they are solely devoted to problems using the
notion of paired-attribute moves: the travelling salesman problem, the job shop problem,
and the quadratic assignment problem (compare Section 5).

In a first step we shall describe a classification of different types of parallelism that
is applicable to most iterative search techniques (cf. Voß (1992)). Its basis is the idea of
having different starting solutions or candidate solutions (so-called balls, motivated by the
idea of a 'mountains'-like solution space where a ball is rolling to find a stable low altitude
state) as well as a number of different strategies, e.g. based on various possibilities of the
parameter setting or on the tabu list management.

• SBSS (Single Ball, Single Strategy)

The algorithm starts from exactly one given feasible solution and performs its moves
following exactly one strategy.

• SBMS (Single Ball, Multiple Strategies)

The algorithm starts from exactly one given feasible solution by the use of different
strategies where each strategy is performed on a different processor.

• MBSS (Multiple Balls, Single Strategy)

The algorithm starts from different initial feasible solutions, each on a different
processor. The same type of instruction, i.e. strategy, is performed on each processor.

• MBMS (Multiple Balls, Multiple Strategies)

The algorithm starts from different initial feasible solutions performing different
strategies.

In what follows we discuss the above ideas in more detail with special emphasis on
further principles of parallelism within specific strategies. For ease of description we
assume the notion of parallel or distributed processing algorithms.

SBSS 

The single ball, single strategy idea is the simplest version, and obviously corresponds to
the idea of classical sequential computations (cf. the SISD-model). This, however, does
not restrict the possibility of parallelization.

Starting from an initial feasible solution, the best move which is not tabu must be
performed. The search for this move may be done in parallel by decomposing the set of
admissible moves into a number of subsets. E.g. in a master-slave architecture each (slave)
processor may evaluate the best move in a specific subset. The best move of each subset
is communicated to the master who picks the overall best as the transformed solution and
also performs the tabu list management.
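
A minimal sketch of this decomposition, written in Python for single-attribute moves on a binary vector, may look as follows. It is an illustration under our own assumptions: worker threads merely stand in for the slave processors (in CPython they do not run objective evaluations truly in parallel), the moves are distributed round-robin, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def best_in_subset(x, z, subset, tabu):
    """Slave: evaluate all non-tabu complement moves of one subset, return the best one."""
    best = None
    for i in subset:
        if i in tabu:
            continue
        y = list(x)
        y[i] ^= 1
        candidate = (z(y), i)
        if best is None or candidate < best:
            best = candidate
    return best

def parallel_best_move(x, z, tabu, n_slaves=4):
    """Master: split the moves into subsets, collect the answers, pick the overall best."""
    subsets = [range(k, len(x), n_slaves) for k in range(n_slaves)]
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        answers = list(pool.map(lambda s: best_in_subset(x, z, s, tabu), subsets))
    answers = [a for a in answers if a is not None]
    return min(answers) if answers else None

# Hypothetical objective; the result is a pair (objective value, index of the move).
z = lambda x: abs(4 * x[0] + 3 * x[1] + 2 * x[2] - 5)
print(parallel_best_move([1, 0, 0], z, tabu=set(), n_slaves=2))
```

The master would then perform the returned move and update the tabu list before the next round.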

To restrict the amount of communication necessary for synchronizing the data, each
slave could determine the best possible move in its subset without observing any tabu
list, while the tabu list is at the same time updated by the master. Then the master picks
among all answers the best which is not tabu. If no such move exists, a second trial must
be made in which each processor has to receive and to observe the tabu list. Otherwise the
next iteration is to be performed.

Additional ideas may be developed with respect to the specific strategies. In TNM,
the tabu list management may be done by each processor itself by simply providing the
most recent move (whose complement will be in the list). In CSM, the master builds the
cancellation sequences and partitions them to the slaves, i.e., every slave has to evaluate
a certain number of sequences. In subsequent iterations, the attributes of the current
moves are communicated. Whenever a cancellation sequence is reduced to 1 it will be
re-communicated to the master.

SBMS 

In SBMS each processor executes a process which is one of the above tabu search strategies
with different tabu conditions and parameters, like e.g. REMt for various t. For TNM this
can be different (possibly randomly modified) tabu list lengths; for CSM, different tabu
durations may be considered. The (slave) processors are halted after a prespecified time
and the results are compared and the best one is calculated. A restart is possible with the
best or a good seed solution. Each strategy may take a different path through the search
space because of different tabu list management or parameter setting. A restart may be
performed either with empty running and tabu lists or with a previously encountered list.
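
The SBMS idea can be illustrated by running one and the same start solution under several tabu list lengths and comparing the results afterwards. The Python sketch below is not taken from any of the cited implementations: the compact static tabu search, the use of threads in place of processors, a fixed number of iterations in place of a time limit and all names are assumptions made for this illustration.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def tabu_run(x0, z, tl_size, iterations):
    """One strategy: static tabu search on a binary vector with tabu list length tl_size."""
    x, tabu = list(x0), deque(maxlen=tl_size)
    best = (z(x), tuple(x))
    for _ in range(iterations):
        moves = []
        for i in range(len(x)):
            if i not in tabu:
                x[i] ^= 1
                moves.append((z(x), i))
                x[i] ^= 1
        if not moves:
            break
        value, i = min(moves)
        x[i] ^= 1
        tabu.append(i)
        best = min(best, (value, tuple(x)))
    return best

def sbms(x0, z, tl_sizes, iterations=20):
    """Single ball, multiple strategies: same start, one tabu list length per worker."""
    with ThreadPoolExecutor(max_workers=len(tl_sizes)) as pool:
        results = list(pool.map(lambda t: tabu_run(x0, z, t, iterations), tl_sizes))
    return min(results)      # compare the results and keep the best one

z = lambda x: abs(4 * x[0] + 3 * x[1] + 2 * x[2] - 5)
print(sbms([1, 0, 0], z, tl_sizes=[1, 2, 3]))
```

The returned pair could serve as the seed solution of a restart.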

MBSS 

The multiple balls approaches start from at most p (the number of processors available)
different initial feasible solutions, whose calculation can vary. They may be determined
either randomly or by applying different heuristics to the same problem. This may also
incorporate ideas involving different diversification and intensification strategies as
described above. A third possibility assumes one given feasible solution and starts with a
suitable subset of its transformed (neighbourhood) solutions. (Especially with REM2 it
may be assured that even in future iterations there is no overlap with the initial feasible
solutions of the other processors.) The single strategy approach assumes the application
of exactly one tabu search algorithm with the same parameter setting for all processors.

As with SBMS, the processes may be halted after a specific time period to coordinate
their results and possibly to initiate a restart with new (hopefully) improved solutions.
If the processes are performed synchronously, then the stopping may be initiated after
having generated, say, m successive moves. On synchronous MIMD machines the latter
approach may be especially relevant. Note that the above-mentioned possibility of
parallelization within SBSS is related to a method with m = 1 where the best transition is
evaluated.² With respect to MBSS, this modifies to the evaluation of the p best moves
usable for a restart. For m > 2 this approach may be used as a look ahead method.

MBMS 

The multiple balls, multiple strategies approach subsumes all previous classes, allowing
search within the solution space from different starting points with different methods or
parameter settings.

² This gives an opportunity to incorporate different candidate list strategies. (Note the
correspondence to ideas of beam search, cf. Glover (1990).)

5 Examples

In the sequel we sketch some of the ideas given in the previous sections with respect to well
known combinatorial optimization problems. As mentioned above, we only found some
work on problems with the idea of paired-attribute moves to perform the neighbourhood
search. We start with respect to binary integer programming, exploiting single-attribute
moves.

Consider the SBSS concept. Also consider n decision variables in a binary problem
with no (implicit or explicit) restriction on the number of variables set to either 1 or 0. We
may define simple ADD- or DROP-moves by complementing the corresponding entries
of the binary variables x_i. Assume the existence of n + 2 processors with n + 2 being
the master processor. The tabu list management is performed by processor n + 1. In
any iteration of the search, each of the synchronously controlled processors i ∈ {1, ..., n}
receives the information which variable's entry has been chosen to be exchanged as the
most recent move. This move is performed together with the reversion of x_i. This usually
can be done quite efficiently by reconstructing the previous solution stored at i with at
most one assignment complemented. Then i offers its objective function value to the
master, who recalls all results of processors referring to non-tabu moves (evaluated by
processor n + 1). Obviously this approach may be generalized in various ways to the more
general classes described above.

This concept may be applied, e.g., to the warehouse location problem (WLP), to
Steiner's problem in graphs (SP), and to the multiconstraint zero-one knapsack problem
(MCKP). E.g., for WLP this neighbourhood search means a reallocation of customers,
i.e., opening a new location i results in re-allocating all customers for which i is closer
than the depot currently used. Correspondingly, closing a location i forces all customers
receiving service from i to their second nearest location.
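
The reallocation caused by an ADD- or DROP-move can be sketched as follows in Python. The cost model is an assumption made for illustration (a matrix dist of customer-to-location service costs plus fixed opening costs), as are the function names; at least one location is assumed to remain open.

```python
def total_cost(open_sites, dist, fixed):
    """Fixed costs of the open locations plus each customer's cost to its nearest open one."""
    service = sum(min(row[j] for j in open_sites) for row in dist)
    return service + sum(fixed[j] for j in open_sites)

def add_move(open_sites, dist, fixed, i):
    """Open location i: customers for which i is closer than their current depot move to i."""
    moved = [c for c, row in enumerate(dist)
             if row[i] < min(row[j] for j in open_sites)]
    return moved, total_cost(open_sites | {i}, dist, fixed)

def drop_move(open_sites, dist, fixed, i):
    """Close location i: its customers are served by their nearest remaining location."""
    moved = [c for c, row in enumerate(dist)
             if min(open_sites, key=lambda j: row[j]) == i]
    return moved, total_cost(open_sites - {i}, dist, fixed)

# Hypothetical data: three customers (rows), two candidate locations (columns).
dist = [[1, 5], [4, 2], [6, 3]]
fixed = [10, 4]
print(add_move({0}, dist, fixed, 1))
print(drop_move({0, 1}, dist, fixed, 1))
```

Each slave processor could evaluate one such move for its own location i and report the resulting cost to the master.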

An even more challenging reoptimization problem arises within SP. There, an iteration
of the neighbourhood search may consist of changing a node-oriented binary variable
and calculating a minimum spanning tree (MST) on the set of all nodes with entry 1 of
the corresponding variables. The question is whether reoptimization may be carried out
either by solving the modified problem anew or by starting from a previous optimal solution
found by the same processor (see Glover et al. (1992) for a corresponding sequential
approach with respect to MST).

If the number or weighted number of variables with value 1 is limited (as for MCKP)
or fixed (as e.g. in the p-median problem) then the same approach may be applied with
combined ADD/DROP- or SWAP-moves leading to paired-attribute moves.

Malek et al. (1989) follow the SBMS approach to solve travelling salesman problems
(TSP) by TNM with the 2-opt exchange as moves. The tabu attributes follow different
strategies in that they are restricted either to one or to the two cities that have been
swapped or to the cities and their respective positions in the tour. In addition different tabu
parameters were used on different processors. For another parallel tabu search algorithm
for the TSP see Fiechter (1990).

The quadratic assignment problem (QAP) is treated by Chakrapani and Skorin-Kapov
(1991, 1992) by the use of SUSS and TNM with search intensification and search
diversification performed sequentially while evaluating the moves in parallel. The set of moves
is partitioned into disjoint subsets, each one on a different processor as described above.
The neighbourhood search is performed by pairwise interchanges such that for O(n^2)
processors available all moves can be evaluated in constant time, achieving a speedup of
O(n^2 / log n). Battiti and Tecchiolli (1992) use TNM together with a hashing function
and compare their algorithm also with a parallel genetic algorithm. Another parallel
algorithm for QAP based on TNM (with randomly varying t_size) has been presented by
Taillard (1991). It is an SBSS approach, too. The same idea has also been applied to the
job shop as well as to the flow shop problem (see Taillard (1989, 1990)). The latter, in
fact, also describes a single-attribute based implementation with attributes corresponding
to objective function values. Chakrapani and Skorin-Kapov (1992) is especially relevant
since its implementation is based on a connectionist approach related to a Boltzmann
machine (cf. Aarts and Korst (1989)).

6  Conclusions 

We have summarized some ideas for developing parallel tabu search algorithms. Motivated
by a famous classification scheme for parallel machine models we proposed a classification
scheme for parallel tabu search algorithms. While research in this field is still in its infancy
we believe that reasonable achievements in the following two aspects will be provided.

•  Development  of  a  framework  for  a  general  parallel  tabu  search  algorithm  that  can 
be  applied  to  a  wide  range  of  combinatorial  optimization  problems. 

•  Empirical  results  for  parallel  tabu  search  algorithms  tailored  to  specific  problems. 

Some results known from the literature (cf. Section 5) support this feeling. Despite the
emphasis on parallel tabu search, sequential testing is still far from complete. In addition,
the tabu search metastrategy should be tested on different classes of parallel algorithms
and machine models. Especially relevant seems to be a comparison of algorithms tailored
to different hardware specifications like vector computers versus synchronous and
asynchronous MIMD machines. However, one should take into account identical user
specifications with respect to tabu search (e.g. parameter setting, definition of the
neighbourhood). Note that our classification scheme is not restricted to parallel tabu search,
but may be applied for nearly any iterative search procedure, such as simulated annealing
or genetic algorithms.
The work of a transport company (bus, train, etc.) may be
represented by a schedule which specifies the journeys to be
undertaken. Figure 1 is a graphical representation of part of
such a schedule, with each line showing the times that a service
begins and ends, and each '+' showing the time of a relief
opportunity at which the driver of that service may be replaced
by another driver. An indivisible period which must be worked by
the same driver (e.g. between two consecutive relief
opportunities) is called a workpiece.
Figure 1 - Graphical Representation of a Schedule
Each driver's working day consists of a number of workpieces. A
complete specification of a driver's working day, including
sign-on, sign-off and mealbreak times, is called a duty. Every
transport company has many conditions that its duties must
satisfy, usually called the "union agreement". This agreement may
specify, for example, the maximum length of a working day and
durations of mealbreaks. There is usually a very large number of
different duties that could be used to cover a schedule.
There are several computer systems which can be used to determine
a set of valid drivers' duties to cover a schedule provided by a
transport company. This paper will consider enhancements that
have recently been devised for one such system called IMPACS
(Integer Mathematical Programming for Automatic Crew Scheduling).
This system was developed at the University of Leeds by Wren &
Smith [1] and is now marketed by the Hoskyns Group. IMPACS has
mainly been used by bus companies (throughout the world) but it
has also been used by train and tram companies.

At  the  heart  of  the  IMPACS  system  is  an  Integer  Programming  model 
which  has  two  pre-emptively  ordered  objectives:  to  minimise  the 
total  number  of  duties  used  to  cover  a  given  schedule  and  to 
minimise  a  cost  function  which  reflects  both  the  wage  cost  and 
undesirable  features  of  duties.  The  model's  constraints  ensure 
that  all  workpieces  are  covered  at  least  once,  with  some 
specially  selected  workpieces  being  covered  exactly  once.  Also, 
each  duty  is  classified  according  to  its  type  (e.g.  early,  late, 
overtime)  and  side  constraints  can  be  added  which  limit  the 
number  of  duties  of  any  type  that  are  to  be  used. 

Thus,  the  model  is  of  the  mixed  set  covering/partitioning  type, 
possibly  with  the  addition  of  side  constraints.  Ongoing  research 
attempts  to  exploit  further  the  special  structure  of  the  IMPACS 
model  and  to  take  advantage  of  recent  developments  in 
mathematical  programming  algorithms. 
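As a reading aid, the model described in words above can be written compactly as follows. This is only a sketch under assumed notation, none of which appears in the original: J is the set of candidate duties, I the set of workpieces, P the subset of I that must be covered exactly once, a_ij = 1 if duty j covers workpiece i, c_j the cost of duty j, J_t the duties of type t, and u_t an assumed upper limit on the number of duties of type t (the side constraints could equally be lower limits).

```latex
\begin{align*}
\operatorname{lex\,min}\quad
  & \Bigl(\ \sum_{j \in J} x_j,\ \ \sum_{j \in J} c_j x_j \Bigr) \\
\text{subject to}\quad
  & \sum_{j \in J} a_{ij} x_j \ge 1, && i \in I \setminus P, \\
  & \sum_{j \in J} a_{ij} x_j = 1,  && i \in P, \\
  & \sum_{j \in J_t} x_j \le u_t,   && t \in T, \\
  & x_j \in \{0, 1\},               && j \in J.
\end{align*}
```

The "lex min" notation expresses the pre-emptive ordering: the number of duties is minimised first, and the cost function is minimised only among solutions that use that minimum number.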
The IMPACS model has previously been solved using the following
four-stage process. For the first three stages, the Linear
Programming relaxation of the model is used.

Stage  1  Minimise  the  total  number  of  duties  using  a  Primal 
Simplex  algorithm. 

Stage  2  Add  a  constraint  which  ensures  that  the  integral  number 
of  duties  does  not  increase  and  minimise  the  cost 
function  using  a  Primal  Simplex  algorithm. 

Stage  3  If  the  total  number  of  duties  is  not  integral,  add  a 
suitable  constraint,  and  reoptimise  using  a  Dual 
Simplex  algorithm. 

Stage 4  Determine an integer solution using Branch and Bound
techniques with constraint branching.

Optimisation within the IMPACS system is based on Ryan's ZIP
package [2]. The performance of this package has been improved by
incorporating Goldfarb & Reid's Primal Steepest Edge algorithm [3]
and a Dual Steepest Edge algorithm due to Forrest & Goldfarb [4].

This  paper  will  consider  a  new  strategy  for  solving  the  Linear 
Programming relaxation. Enhancements to stage 4 are the subject
of  separate  work. 


Each of stages 1 and 2 of the previous strategy typically involves
a large number of iterations, resulting in the time to solve the
Linear Programming relaxation being a significant proportion of
the total solution time. This is due to the objectives for stages
1 and 2 being different and the high degree of degeneracy
inherent in the model. Also, the constraint that is added at
stage 2 is fully dense, and this substantially increases
iteration timings.

These  difficulties  have  been  addressed  by: 

1.  Using a single weighted objective function, and
2.  Solving the resulting model using a Dual Steepest Edge
    algorithm.

The weight that is used to combine the two objectives is
relatively small, and is determined by applying an algorithm due
to Sherali [5] to the IMPACS model. To initiate the Dual Simplex
algorithm, a heuristic has been developed to produce initial
basic dual feasible solutions.
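The effect of replacing two pre-emptively ordered objectives by a single weighted one can be seen on a toy covering instance. The sketch below is an illustration only: the duties, costs and weights are invented, the search is a brute-force enumeration over integer solutions rather than a Simplex solve of the relaxation, and Sherali's algorithm for choosing the weight is not reproduced.

```python
from itertools import combinations

# Hypothetical duties: name -> (workpieces covered, cost).
DUTIES = {
    "d1": ({1, 2}, 5),
    "d2": ({3, 4}, 5),
    "d3": ({1, 2, 3}, 6),
    "d4": ({4}, 1),
    "d5": ({1}, 1),
    "d6": ({2, 3}, 2),
}
WORKPIECES = {1, 2, 3, 4}


def covers(names):
    """True if the chosen duties cover every workpiece at least once."""
    covered = set()
    for name in names:
        covered |= DUTIES[name][0]
    return covered >= WORKPIECES


def feasible_sets():
    """Yield every set of duties (as a sorted tuple) that covers the schedule."""
    names = sorted(DUTIES)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if covers(subset):
                yield subset


def cost(subset):
    return sum(DUTIES[name][1] for name in subset)


def lexicographic_optimum():
    """Minimise the number of duties first, then the cost."""
    return min(feasible_sets(), key=lambda s: (len(s), cost(s)))


def weighted_optimum(weight):
    """Minimise the single objective: number of duties + weight * cost."""
    return min(feasible_sets(), key=lambda s: len(s) + weight * cost(s))


print(lexicographic_optimum())   # two duties, cheapest such cover
print(weighted_optimum(0.01))    # small weight: same answer
print(weighted_optimum(1.0))     # large weight: cost dominates the duty count
```

With the small weight the weighted objective reproduces the pre-emptive optimum, whereas with the large weight a cheaper three-duty cover wins, which is why the weight has to be kept relatively small.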

The paper will conclude with the presentation of computational
results for real world problems with numbers of constraints in
the range 125 to 450 and numbers of variables in the range 4000
to 11000. The results suggest that the new strategy significantly
reduces solution timings.
This talk will begin by discussing duality in Mathematics in a wider
context  e.g.  in  the  areas  of  Set  Theory  and  Logic,  Projective  Geometry  and 
Convex  Polytopes.  Some  of  the  mathematical  properties  which  are  normally 
expected  of  a  dual  will  be  listed  e.g.  Reflexivity  and  Symmetry.  Linear 
Programming  (LP)  and  Congruence  duality  will  then  be  examined  for  both  its 
mathematical  properties  and  computational  and  economic  uses  e.g.  Proving 
Optimality,  Sensitivity  Analysis  and  Pricing  Imputation. 

A  number  of  possible  Integer  Programming  (IP)  duals  will  be 
mentioned  e.g.  the  Gomory-Baumol  dual,  Lagrangean  dual  and  Surrogate 
dual.  They  all  lack  some  of  the  above  properties  and  in  particular  do  not 
provide  a  guaranteed  proof  of  optimality. 

It  will  be  suggested  that  the  most  satisfactory  dual  arises  from 
examining  the  Value  Functions  and  Consistency  Testers  of  IPs.  For  Pure 
IPs  (PIPs)  these  take  the  form  of  Gomory  Functions.  Gomory  functions  are 
built  up  by  the  repeated  applications  of  the  operations  of 


(i)  Non-negative linear combinations.

(ii)  Integer  round-up. 

(iii)  Taking  Maxima. 
These can be expressed in the form

$$\max(C_1, C_2, \ldots, C_n) \qquad (1)$$

where the $C_j$ are Chvátal functions, which are built up from operations (i)
and (ii).

By comparison the Value function of the Consistency Tester of the
corresponding LP relaxation will involve functions of the form

$$\max(\bar{C}_1, \bar{C}_2, \ldots, \bar{C}_r) \qquad (2)$$

where $r \le n$ and $\bar{C}_j$ is obtained from $C_j$ by dropping the operation (ii).

The $\bar{C}_j$ will therefore be non-negative linear combinations of the right-
hand-side coefficients, arising from the dual vertices of the LP.

It will be shown that those $C_j$ which correspond to a $\bar{C}_j$ in (2) can be
obtained by finding the Value function of PIPs over cones. This may be done
by obtaining the Hermite Normal Form of the corresponding basis matrix for
the LP relaxation. The resulting doubly recursive function of the right-hand-
side coefficients gives the Value function (and Consistency tester). It is
suggested that the depth of this recursion is a measure of complexity. The
problem of extending this method to give the Value function and Consistency
tester for a general PIP will be considered.
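A very small example may make the relationship between forms (1) and (2) concrete. The sketch below is not taken from the talk: it uses an invented one-variable pure IP, min x subject to 2x >= b1, 3x >= b2, x a non-negative integer, whose value function is the Gomory function max(ceil(b1/2), ceil(b2/3), 0). Dropping the round-up gives the value function of the LP relaxation.

```python
def ceil_div(a, b):
    """Integer round-up of a / b for b > 0 (operation (ii))."""
    return -(-a // b)


def ip_value(b1, b2, limit=100):
    """Brute-force value of: min x  s.t.  2x >= b1, 3x >= b2, x integer >= 0."""
    for x in range(limit + 1):
        if 2 * x >= b1 and 3 * x >= b2:
            return x
    raise ValueError("limit too small for this right-hand side")


def gomory_value(b1, b2):
    """Maximum of Chvatal functions of the right-hand side (form (1))."""
    return max(ceil_div(b1, 2), ceil_div(b2, 3), 0)


def lp_value(b1, b2):
    """Same expression with the round-up operation dropped (form (2))."""
    return max(b1 / 2, b2 / 3, 0)


for b1, b2 in [(5, 4), (1, 7), (-3, 0)]:
    print((b1, b2), ip_value(b1, b2), gomory_value(b1, b2), lp_value(b1, b2))
```

In this toy case every term of the LP expression has a rounded counterpart in the IP expression, so r = n; in general the IP value function may need additional Chvátal functions that have no LP counterpart.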

It will be shown that the Value function for a Mixed IP (MIP) is not
generally a Gomory function although the Consistency tester is. By
incorporating the objective as a constraint and finding the Consistency tester
of this system it is then possible to characterise the Value function of the
MIP.

The  Value  function  for  certain  MIP  applications  has  considerable  economic 
importance  since  it  shows  how  indivisible  resources  should  be  "priced".  This 
aspect  will  be  considered  in  relation  to  the  Fixed  Charge  Problem  and  the 
Power  Systems  Loading  Problem. 


1  General  Problem  Description 


Analysts frequently face the following problem: given a multivariate (possibly
correlated) population, how does one determine a good estimate of the
probability function (or some number of its moments) for a complicated function
of the population's variables? The primary problem to consider then is
what is the most efficient way to sample from the input population, especially
when sampling is extremely expensive and must therefore be limited
to a predetermined (small) sample size. The desire is to generate a sampling
plan which will be representative of the population, and produce estimates
of moments which have desirable statistical properties. However, since the
larger the sample, the larger the cost, there is a trade-off between generating
the best estimates and reducing the amount of sampling. In order to obtain
better estimates from sampling, analysts may determine them by using data
collected from a stratified sampling of the population.

A special form of stratified sampling is latin hypercube sampling (LHS).
In this stratification, the cumulative distribution function for each of the
n population variables is divided into m blocks. The intersection of these
blocks makes up a hypercube having $m^n$ cells. If all $m^n$ cells were sampled,
the sampling approach would be a “full factorial design”. Since sampling
is assumed to be expensive, LHS limits the sampling to only m of the $m^n$
possible cells. Thus, a LHS plan is not a hypercube, but is equivalent to an
m x n matrix such that each of the m rows defines one sampling cell of an $m^n$
hypercube.

The $i$-th row of a LHS sampling plan makes up what will be referred
to as “run i”. Defining this grouping as a run is motivated by the fact
that typical applications of LHS involve computer-based models where the
number of runs, m, is predetermined. To ensure that a plan offers a cross
section of the sampling space, an additional feature of LHS is that each block
of each variable must be picked once. Thus, each column of a LHS plan is a
permutation of the numbers 1 to m.
Combinatorial optimization, by its broad nature, has been used
to model and solve a variety of problems including those arising in decision,
engineering, and physical sciences. The focus of this work is to consider
the solution of a sampling design problem using combinatorial optimization.
The particular design problem of interest here is minimum-correlation latin
hypercube sampling (hereafter referred to as MCLHS). The central point of
this research is the development of combinatorial optimization procedures
which provide MCLHS plans. This is an entirely new approach for finding
MCLHS plans.

We introduce integer programming (IP) formulations of this problem
and develop a procedure for determining minimum-correlation sampling
designs. We provide the obvious IP formulation of the MCLHS problem which
results in a problem having an exponential number of variables and a large
(polynomial) number of constraints. We then transform the problem into
a sequence of assignment problems with side knapsack equations, having a
polynomial number of variables. This decomposition was found by exploiting
the special structure of the problem and finding tight objective function
lower bounds. We note that even after the decomposition, the problem still
belongs to the NP-hard class. Although the decomposition and subsequent
development of solution procedures for the smaller problems are discussed
within the context of the sampling design problem, the approach may be
applicable to various permutation-related IP problems such as the general
quadratic assignment problem, assignment problems with side constraints,
and the asymmetric travelling salesman problem variation where the objective
is to find a tour which meets a specific cost value. Thus, while the
research presented here focuses on solution approaches for the MCLHS problem,
the general theory and findings might well prove useful for the solution
of other problems known to be NP-complete.

We begin with a description of the general LHS and MCLHS problems,
followed by integer programming formulations and a discussion of optimization
procedures developed.
To describe the standard approach to LHS, we begin by writing the
vector of variables as $(X_1, X_2, \ldots, X_n)$ and assume for the time being that the
variables are mutually independent. The range of each $X_i$ is divided into m
(= number of runs) ascending intervals of equal probability and a random
value is drawn on each interval for each variable. Next, we generate the order
in which the m values of each variable are to be used in each run by creating
a sequence of n random permutations of the integers 1 to m. Finally, we
form the required vector for the $i$-th run by taking the $i$-th number from each
of the n random permutations.
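The standard approach just described is short enough to sketch directly. The following is a minimal illustration, not the authors' code: the function names are invented, and the sampled values are expressed on the cumulative-probability scale (0, 1), so a non-uniform variable would still need its inverse distribution function applied to them.

```python
import random


def lhs_plan(m, n, rng):
    """Return an m x n plan; each column is a random permutation of 1..m."""
    columns = []
    for _ in range(n):
        column = list(range(1, m + 1))
        rng.shuffle(column)
        columns.append(column)
    # Row i collects the i-th entry of each permutation: this is "run i".
    return [[column[i] for column in columns] for i in range(m)]


def sample_values(plan, rng):
    """Draw one value inside the equal-probability interval chosen for each cell.

    On the cumulative-probability scale, interval b of m is ((b - 1)/m, b/m).
    """
    m = len(plan)
    return [[(block - 1 + rng.random()) / m for block in row] for row in plan]


rng = random.Random(0)          # fixed seed only to make the sketch repeatable
plan = lhs_plan(6, 3, rng)      # 6 runs, 3 variables
values = sample_values(plan, rng)
for run, (blocks, vals) in enumerate(zip(plan, values), start=1):
    print(run, blocks, [round(v, 3) for v in vals])
```

Because the columns are shuffled independently, nothing in this procedure controls the correlation between columns, which is the issue taken up next.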

Latin hypercube sampling plans generated by the standard approach
are restricted only in the sense that for each variable, a value must be picked
once and only once from each of its m intervals. A point we have not yet
considered is the impact that correlations between the columns of a LHS sampling
plan may have on the generated estimates. For ease of explanation, we
will continue the assumption that the population variables are mutually
independent, although similar results are obtained for any given population
covariance matrix. For the n variables, although their sampling plan
permutations are determined independently, a standard LHS plan will, in general,
have some level of correlation between the pairs of permutations. Thus, the
sampling plans will not, in general, parallel the correlations of the true joint
distributions. If LHS sampling is done without concern for the correlation
pattern (or lack thereof), the estimators cannot be guaranteed to be unbiased
or even consistent.

The desire then is to design LHS plans which incorporate the variables'
true pairwise correlations. For two variables, $X_i$ and $X_j$, with distribution
functions having strictly positive standard deviations, $\sigma_i$ and $\sigma_j$, the
correlation coefficient between the variables is defined as

$$\rho_{ij} = \frac{\operatorname{cov}(X_i, X_j)}{\sigma_i \sigma_j},$$

where $\operatorname{cov}(X_i, X_j)$ denotes the covariance between variables $X_i$ and $X_j$.
To approximate the pairwise correlation coefficients $\rho_{ij}$, we will
consider the correlation coefficients between the pairs of LHS plan permutations
associated with variables $X_i$ and $X_j$. (The two forms of correlations are equal
when $X_i$ and $X_j$ are both uniformly distributed.) For permutations of the
integers from 1 through m, it can be shown that the correlation coefficient
of the indices of any pair of permutations is

$$\hat{r} = 1 - \frac{6 \sum_{u=1}^{m} D_u^2}{m(m^2 - 1)},$$

where $D_u$ is the difference between the $u$-th integer elements in the vectors.
This is known as the Spearman rank correlation coefficient and can take on
values in the interval [-1, 1]. The expected value of the rank correlation
coefficient is 0, and its variance is 1/(m - 1). Throughout the remainder
of this paper, we denote the rank correlation estimate between the column
permutations of variables $X_i$ and $X_j$ by $\hat{r}_{ij}$.

For  illustration,  suppose  we  want  to  run  a  model  with  three  mutually 
independent  uniformly  distributed  variables  (for  simplicity,  x,  y,  and  z), 
each  to  be  represented  by  values  chosen  from  their  respective  sample  spaces. 
Assuming  further  that  we  are  allowed  only  six  runs,  consider  the  LHS  plan 
given  below: 

Table 1: Latin Hypercube Sampling Example

Model Run    Variable Values
    1        x1    y1    z5
    2        x2    y6    z3
    3        x3    y5    z4
    4        x4    y3    z1
    5        x5    y2    z2
    6        x6    y4    z6


The rank correlation coefficients for this example are

$\hat{r}_{12} = 0.00$, $\hat{r}_{23} = 0.00$, $\hat{r}_{13} = 0.00$,

and hence, it appropriately models the mutual independence of the three
variables. If, for example, the variables were dependent with true joint
distributions having pairwise rank correlations of say, $r_{12} = -.6$, $r_{23} = -.42$,
and $r_{13} = .14$, then this particular sampling plan would not suitably parallel
these true rank correlations.
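The rank correlations of a plan are easy to check by machine. The sketch below computes the Spearman coefficient with exact fractions for the six-run example, under the assumption that the three columns of Table 1 read x = (1, 2, 3, 4, 5, 6), y = (1, 6, 5, 3, 2, 4) and z = (5, 3, 4, 1, 2, 6); the entries for runs 1 and 4 are partly illegible in the source, so those values are a reconstruction, and the function name is ours.

```python
from fractions import Fraction
from itertools import combinations


def rank_correlation(p, q):
    """Spearman rank correlation of two permutations of 1..m."""
    m = len(p)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return 1 - Fraction(6 * sum_d2, m * (m * m - 1))


# Assumed reading of the columns of Table 1 (runs 1..6).
columns = {
    "x": [1, 2, 3, 4, 5, 6],
    "y": [1, 6, 5, 3, 2, 4],
    "z": [5, 3, 4, 1, 2, 6],
}

for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
    r = rank_correlation(a, b)
    print(name_a, name_b, r, round(float(r), 3))
```

Under this reading all three coefficients have magnitude 1/35 (about 0.03), that is, they are close to but not exactly zero. For m = 6 the sum of squared differences is always even while m(m^2 - 1)/6 = 35 is odd, so an exactly zero rank correlation cannot occur.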


The objective of the restricted LHS problem we consider is the selection
of column permutations which attempt to meet exactly the true rank
correlations associated with the variables. In this way, sampling is intended
to match more closely the true marginal distributions of the input variables.
Specifically, the minimization problem, called minimum-correlation
LHS (MCLHS), provides a sampling specification minimizing the sum of the
absolute values of the pairwise differences $(\hat{r}_{ij} - r_{ij})$. In much of the
discussion however we will minimize the sum of the absolute values of the pairwise
rank correlations $\hat{r}_{ij}$. This models the situation when independence of the
variables is likely (i.e., $r_{ij} = 0$).

2  Integer  Programming  Models  for  MCLHS 

The  minimum-correlation  latin  hypercube  sampling  problem  described  can 
be  formulated  as  a  n-index  assignment  problem  with  side  knapsack  equation 
constraints  (APSEC).  To  begin,  define: 

$$x_{v_1 v_2 \cdots v_n} = \begin{cases} 1 & \text{if } (v_1, v_2, \ldots, v_n) \text{ is a sampled cell} \\ 0 & \text{otherwise} \end{cases}$$

where the n indices on the x-variable, $v_1, v_2, \ldots, v_n$, can each take a value
from 1 to m,

and also nonnegative $d^+_{ij}$, $d^-_{ij}$ such that:

$$d^+_{ij} - d^-_{ij} = (\hat{r}_{ij} - r_{ij})\, m(m^2 - 1)/6, \qquad i = 1, \ldots, n,\ j > i.$$

Thus, $d^+_{ij}$ and $d^-_{ij}$ are the positive and negative magnitudes of the
deviations from the true rank correlation of the rank correlation between column
permutations i and j.

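Equivalent to minimizing the sum $\sum_{j=2}^{n} \sum_{i=1}^{j-1} |\hat{r}_{ij} - r_{ij}|$ is
minimizing the objective function

$$\sum_{i=1}^{n-1} \sum_{j>i} \left( d^+_{ij} + d^-_{ij} \right).$$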
Although the formulations described below are applicable to cases
with nonzero $r_{ij}$, for ease of presentation, we will assume $r_{ij} = 0$. It can be
shown that

$$m(m^2 - 1)\,\hat{r}/6 = m(m^2 - 1)/6 - \sum_{u=1}^{m} D_u^2$$

will be integer-valued for all pairs of permutations. Thus, we can now define
$d^+_{ij}, d^-_{ij} \in \mathbb{Z}_+$ such that:

$$d^+_{ij} - d^-_{ij} = m(m^2 - 1)\,\hat{r}_{ij}/6, \qquad i = 1, \ldots, n,\ j > i.$$

In order that the IP formulation fully encompasses the MCLHS, it
must include assignment constraints that draw a one-to-one correspondence
between the positive-valued $x_{v_1 v_2 \cdots v_n}$ of a feasible solution and n-tuples of
column permutations. Thus, the $j$-th column permutation requires the m
assignment constraints

$$\sum_{v_1=1}^{m} \cdots \sum_{v_n=1}^{m} x_{v_1 v_2 \cdots v_n} = 1, \qquad v_j = 1, \ldots, m,$$

where the summation runs over all indices excluding $v_j$.

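Additional constraints are needed to enforce that
$m(m^2 - 1)\,\hat{r}_{ij}/6 = d^+_{ij} - d^-_{ij}$ holds for all i and $i < j \le n$. These constraints
are

$$\sum_{v_1=1}^{m} \sum_{v_2=1}^{m} \cdots \sum_{v_n=1}^{m} (v_i - v_j)^2\, x_{v_1 v_2 \cdots v_n} + d^+_{ij} - d^-_{ij} = m(m^2 - 1)/6 \qquad \forall\ i < j \le n.$$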

Besides being NP-complete, this formulation requires $m^n$ x-variables
as well as a total of n(n - 1) deviational ($d^+_{ij}$, $d^-_{ij}$) variables. There are
nm assignment constraints and n(n - 1)/2 constraints to ensure that
$m(m^2 - 1)\,\hat{r}_{ij}/6 = d^+_{ij} - d^-_{ij}$. Hence, although
this formulation is the most straightforward, we will present other APSEC
formulations which have more reasonable problem size growth.
To develop alternative formulations, we use the objective function
lower bound of $\binom{n}{2}\, 6/(m(m^2 - 1))$ when m = 2 + 4l for some nonnegative
integer l, and zero otherwise, and make the assumption that given k - 1
column permutations with minimum $\sum_{j>i} |\hat{r}_{ij}|$, it is possible to fix
these columns and find an optimal $k$-th column permutation. Our research
and empirical results have shown that these are valid assumptions.

Suppose we have a solution to the (k - 1)-dimensional problem, and
wish to use this solution to obtain a solution to the k-dimensional problem.
Let $(p^1, \ldots, p^{k-1})$ denote the corresponding column permutation vectors,
and define

$$x_{ij} = \begin{cases} 1 & \text{if the } i\text{-th element of column } k \text{ is assigned value } j \\ 0 & \text{otherwise} \end{cases}$$


To ensure that column k is a permutation of numbers 1 ... m, we add the
assignment constraints:

$$\sum_{i=1}^{m} x_{ij} = 1, \qquad j = 1, \ldots, m,$$

$$\sum_{j=1}^{m} x_{ij} = 1, \qquad i = 1, \ldots, m.$$



We see that the positive elements of an x-solution define a column. We
will henceforth interchangeably use the terms “an x-solution” and “the $k$-th
column defined by the positive elements of x”.

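There are (k - 1) additional constraints of the following form:

$$\sum_{i=1}^{m} \sum_{j=1}^{m} (p^t_i - j)^2\, x_{ij} + d^+_{tk} - d^-_{tk} = m(m^2 - 1)/6, \qquad t = 1, \ldots, k-1, \qquad (1)$$

where $p^t_i$ is the $i$-th entry of the column permutation vector $p^t$. With these
constraints, we implicitly fix the (k - 1) previously found column permutations.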
The formulation defined thus far with objective function

$$\min \sum_{t=1}^{k-1} \left( d^+_{tk} + d^-_{tk} \right)$$

is a general formulation for finding a $k$-th column permutation, having fixed
the (k - 1) column permutations that minimize $\sum_{j>i} |\hat{r}_{ij}|$. Empirical
evidence strongly supports that there exists a $k$-th column that meets the
lower bound for $|\hat{r}_{tk}|$, t = 1, ..., k - 1. Hence, there will exist a solution to
the assignment constraints that generates a $k$-th column satisfying

$$\left| m(m^2 - 1)\,\hat{r}_{tk}/6 \right| = \left| m(m^2 - 1)/6 - \sum_{u=1}^{m} D_u^2 \right| = \begin{cases} 1 & \text{if } m = 2 + 4l,\ l \in \mathbb{Z}_+ \\ 0 & \text{otherwise} \end{cases}$$

for all t = 1, ..., k - 1. To incorporate this into the formulation, we require
that $d^+_{tk}$ and $d^-_{tk}$, t = 1, ..., k - 1 be binary variables. For
$m \ne 2 + 4l$, $l \in \mathbb{Z}_+$, any solution that obtains the lower bound must have
$d^+_{tk} + d^-_{tk} = 0$; however, if m = 2 + 4l, $l \in \mathbb{Z}_+$, we can conclude that
$d^+_{tk} + d^-_{tk} = 1$, t = 1, ..., k - 1. In either case, the problem can be restated
as a feasibility problem with no objective function. We shall refer to this
feasibility assignment problem with side equations as FASE.
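For very small m the feasibility problem can simply be enumerated, which gives a concrete picture of what a FASE solution is. The sketch below is a brute-force illustration and not one of the solution procedures of the paper: the function names are ours, and the two fixed columns are an assumed reading of the first two columns of the six-run example.

```python
from itertools import permutations


def squared_distance(p, q):
    """Sum of squared element-wise differences between two columns."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def feasible_columns(fixed_columns, m):
    """Enumerate k-th columns meeting the lower bound against every fixed column.

    A candidate is feasible when |m(m^2 - 1)/6 - sum of D^2| equals 1
    if m = 2 + 4l for a nonnegative integer l, and 0 otherwise.
    """
    target = m * (m * m - 1) // 6        # always an integer
    bound = 1 if m % 4 == 2 else 0
    return [
        candidate
        for candidate in permutations(range(1, m + 1))
        if all(abs(target - squared_distance(p, candidate)) == bound
               for p in fixed_columns)
    ]


# Assumed first two columns of the six-run example.
x = (1, 2, 3, 4, 5, 6)
y = (1, 6, 5, 3, 2, 4)
third_columns = feasible_columns([x, y], 6)
print(len(third_columns), (5, 3, 4, 1, 2, 6) in third_columns)
```

Enumeration costs m! evaluations per new column, so it is only useful as a check on tiny instances; it says nothing about how the heuristic or Lagrangean-based procedures behave.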

The FASE formulation follows the conjecture that one can iteratively
solve k-dimensional problems using the previously determined (k - 1)-
dimensional solutions. Thus, rather than solving one large APSEC program
with $m^n + n(n - 1)$ variables and $nm + n(n - 1)/2$ constraints, one could solve a
sequence of smaller two-dimensional FASE problems with at most $m^2 + 2k$
variables and $2m + k$ constraints ($2 \le k \le n$).

In the presentation, we shall discuss heuristic and Lagrangean-based
solution procedures developed to solve the MCLHS problem and its equivalent
formulation FASE.
Abstract

This paper describes a standard for the use of GAMS 2.25 as an
object-oriented modeling language. The over-riding benefit of using
this technique is the ease with which many individuals can simultaneously
develop extraordinarily complex modeling systems. Lesser,
but still important benefits include: structured user-interface design,
plug-in/plug-out models, isolating portions of the problem, easy maintenance
and updates, and model re-use. Simultaneous model development
stems from the latter benefits, while all of these advantages
derive from the clear, rigorous organization of your model as specified
in the following standard.

We present the concepts of encapsulation (forming objects) and
hierarchical modeling in the context of mathematical modeling.
Encapsulation is a well-known programming technique that is newly applied
to modeling, and our version of hierarchical modeling differs slightly
from past notions. Traditionally, a hierarchical model embodies the
concept of forming larger models from a collection of sub-models. The
following method is based on a partition of the relations (equations) of
the model, where the elements of the partition are partially ordered.


1  Overview 

Object-oriented modeling (OOM) is a method of modeling that closely
imitates object-oriented programming (OOP) [?,?,?]. We have developed a
standard for using GAMS 2.25 [?] as an OOM language. The difference is
that the OO models are much more structured and abstract. This makes
them more user friendly because their use is well defined by the structure
and their details are hidden within. OO models thus appear simpler and
more uniform to the user.

Four  essential  properties  set  OOM  apart  from  standard  GAMS  2.25: 

Routines: Structuring the assignment statements into procedures as in
Pascal.

Encapsulation: Combining data and variables with the equations and
assignment statements that manipulate them to form a new data type: a
model.

Information  Inheritance:  Defining  a  model  that  uses  other  models  in  its 
formulation,  with  each  sub-model  inheriting  the  information  from  its 
ancestors.  The  use  of  models  within  models  defines  the  use  hierarchy 
which  forms  a  partial  ordering  of  all  used  models. 

Polymorphism:  Giving  a  model’s  routine  one  name  that  is  shared  by  all 
descendants  in  the  use  hierarchy,  with  each  descendant  implementing 
the  routine  in  a  way  appropriate  to  itself. 

Routines are implemented using the $INCLUDE statement. Encapsulation,
inheritance, and polymorphism are implemented in GAMS 2.25 through
self-discipline. The following is a detailed discussion of the principles and
implementation of OOM in GAMS 2.25. We hope that
the future will bring the language extensions needed for a proper
implementation, in which case the standard described below would be enforced by the
compiler.

There are now a variety of experimental modeling languages offering
object-oriented features, notably ASCEND [?] and MODEL.LA [?]. We offer
a form of inheritance that differs from the class inheritance of standard
OOP and OOM languages. This is an extra restriction placed on our models,
based on deferred requirements and on the use of models within other models.
Data and variables are legated (passed down) to the descendants, while
methods are used by ancestors to ensure that deferred information* is
properly defined.


* Data and variables that have been declared but are not yet defined.

There is a restricted form of communication control between the models of
the use hierarchy. Essentially, descendants can inspect ancestor information,
but ancestors can only ask that certain information be provided. In this way,
siblings communicate through the parent and its deferred information.

We expound further on these concepts below; the presentation is organized
as follows. First we introduce a model and how it is encapsulated. This
leads  to  an  overview  of  traditional  hierarchical  modeling.  Then  we  explain 
how  OOM  fits  into  this  background.  The  final  section  gives  the  standard 
itself — how  to  implement  OOM  in  GAMS  2.25. 
EXTENDED ABSTRACT


1.  Motivations  for  a  formal  theory. 

The definition of a specific model is often conceived as work that has to be
done from scratch. Indeed, the variety of the variables describing the modelled
reality seems to exclude the possibility that a model can be defined by
assembling pieces of correlated sub-models. Defining models from scratch
greatly decreases the productivity of the work.

It seems that the keyword in increasing modelling productivity is "reusability".
Models can be reused and integrated so as to produce new models. Naturally, the
models to be integrated have to be expressed on a common base, and the result
has to lie within the same framework. In this paper the chosen framework is
Structured Modeling, as formally defined by Geoffrion [3].

Here we define three integration levels, according to the degree of influence of the
operator in the procedures used to merge the models:

Level 1 • All the procedures are automated. This means that the user selects the
input models and the genera to be integrated, and the output integrated
model is automatically produced.

Level 2 • The user selects the input models and the order of integration among
the genera, and the output integrated model is automatically produced.

Level 3 • The user selects the input models and the genera to be integrated, and
formulates the steps necessary to integrate them. The output integrated
model is not automatically produced, since the steps can vary according
to the situation.

2.  Preliminary  results. 

In the rest of this paper we assume that the reader is familiar with the formal theory of
Structured Modeling.

Given a Structured Model $M_i$, let $G_i = \{g_j,\ j = 1, \dots, k\}$ be the set of all its genera;
this can be partitioned into three disjoint sets $PC_i$, $A_i$ and $FT_i$ such that:

$PC_i = \{g_j \in G_i : g_j \text{ is a primitive or a compound entity genus}\}$

$A_i = \{g_j \in G_i : g_j \text{ is an attribute genus}\}$

$FT_i = \{g_j \in G_i : g_j \text{ is a function or a test genus}\}$.

Lemma 1: Any genus $g_j \in PC_i$ has no references to any genus $g_k \in (A_i \cup FT_i)$.

Proof: Primitive entity elements, by definition, have no calling sequence, therefore they do not have
references to any other element; compound entity elements, by definition, are constructed only on
primitive entity elements. ■

Lemma 2: Any genus $g_j \in A_i$ has references only to genera $g_k \in PC_i$.

Proof: Attribute elements, by definition, characterize only primitive and compound entity elements. ■

Lemma 3: Any genus $g_j \in FT_i$ has no references to any genus $g_k \in PC_i$.

Proof: Function and test elements call, by definition, attribute, function and test elements; therefore,
they cannot call primitive and compound entity elements. ■
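
Lemmas 1 to 3 amount to a constraint on which kind of genus may reference which. The following is a minimal Python sketch of that check, not part of the original paper: the dictionaries `kinds` and `calls`, the kind codes, and the transportation-style genus names are all illustrative assumptions.

```python
# Kind codes (assumed for this sketch): "pe"/"ce" = primitive/compound entity,
# "a" = attribute, "f"/"t" = function/test.
PC = {"pe", "ce"}
A = {"a"}
FT = {"f", "t"}

def lemma_violations(kinds, calls):
    """Return every (caller, callee) reference that contradicts Lemmas 1-3."""
    bad = []
    for g, refs in calls.items():
        for h in refs:
            if kinds[g] in PC and kinds[h] in A | FT:    # Lemma 1
                bad.append((g, h))
            elif kinds[g] in A and kinds[h] not in PC:   # Lemma 2
                bad.append((g, h))
            elif kinds[g] in FT and kinds[h] in PC:      # Lemma 3
                bad.append((g, h))
    return bad

# Hypothetical toy model: genus name -> kind, genus name -> calling sequence.
kinds = {"PLANT": "pe", "CUST": "pe", "LINK": "ce",
         "COST": "a", "FLOW": "a", "TOTAL": "f"}
calls = {"PLANT": [], "CUST": [], "LINK": ["PLANT", "CUST"],
         "COST": ["LINK"], "FLOW": ["LINK"], "TOTAL": ["COST", "FLOW"]}

print(lemma_violations(kinds, calls))  # prints []
```

A reference from a function genus straight to an entity genus, for instance, would be reported as a violation of Lemma 3.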

Definition 1: Connected Module, Sub-Model.

A module is a Connected Module if its genera and their calling sequences define a
connected graph. A Sub-model is a connected module with at least one primitive entity
genus.
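
Definition 1 can be read as a graph test. The sketch below is an illustration under assumptions, not the authors' implementation: a module is taken to be a set of genus names, `calls` and `kinds` are hypothetical dictionaries, and edge direction is ignored when testing connectivity.

```python
def is_connected(genera, calls):
    """True if the genera and their calling sequences form a connected graph."""
    genera = set(genera)
    if not genera:
        return False
    neighbours = {g: set() for g in genera}
    for g in genera:
        for h in calls.get(g, []):
            if h in genera:              # only references inside the module count
                neighbours[g].add(h)
                neighbours[h].add(g)
    seen, stack = set(), [next(iter(genera))]
    while stack:
        g = stack.pop()
        if g not in seen:
            seen.add(g)
            stack.extend(neighbours[g] - seen)
    return seen == genera

def is_sub_model(genera, calls, kinds):
    """A connected module containing at least one primitive entity genus ("pe")."""
    return is_connected(genera, calls) and any(kinds[g] == "pe" for g in genera)

# Hypothetical toy model.
kinds = {"PLANT": "pe", "LINK": "ce", "COST": "a", "FLOW": "a"}
calls = {"PLANT": [], "LINK": ["PLANT"], "COST": ["LINK"], "FLOW": ["LINK"]}

print(is_sub_model({"PLANT", "LINK", "COST"}, calls, kinds))  # True
print(is_sub_model({"COST", "FLOW"}, calls, kinds))           # False
```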

Definition 2: Behaviour Equivalence on $\overline{FT}_i \subseteq FT_i$.

Two structured models $M_1$ and $M_2$ are Behaviour Equivalent on $\overline{FT}_1 \subseteq FT_1$ and
$\overline{FT}_2 \subseteq FT_2$ if the following two conditions hold:

a) The set $\overline{A}_1$ of the attribute genera directly or indirectly called by the $g_j \in \overline{FT}_1$, and the
set $\overline{A}_2$ of the attribute genera directly or indirectly called by the $g_j \in \overline{FT}_2$ have the same
structure;

b) $\overline{FT}_1$ and $\overline{FT}_2$ give as output the same values.

We shortly write “behaviour equivalent” when the sub-set $\overline{FT}_i$ coincides with $FT_i$.

Definition  3:  Normal  Model. 

A  model  is  called  normal  if  an  isomorphic  relation  exists  between  attribute  and 
compound  genera,  and  their  elements. 


The  graph  of  the  elements  of  a  normal  model  is  shown  in  figure  1;  dotted  rectangles 
identify  genera. 
Proposition 1. Given a Structured Model $M_i$, it is always possible to construct a
normal model $N(M_i)$ which is behaviour equivalent to $M_i$.

Proof: Let us consider a generic attribute genus $g_j \in A_i \subset M_i$. It is always possible to define a new
compound entity genus, $c_k \in PC_i$, with the same calling sequence as $g_j$. Lemmas 1 and 2 ensure that
genera which are called by an attribute genus can be called by a compound entity genus too. An
isomorphic relation can be set among the elements of $g_j$ and $c_k$: the first element of $g_j$ calls the first
element of $c_k$, etc. This process is repeated for every attribute genus of $M_i$.

If we indicate with $N(M_i)$ the modified model, the pair $B = \{c_k, g_j\} \subset N(M_i)$ is equivalent to
$g_j \in M_i$ for every genus $g_k \in FT_i \subset N(M_i)$. ■

Definition  4:  Index  Basis,  Index  Basis  Set. 

An Index Basis of a normal model $N(M_i)$ is a couple of genera $B_j = \{a_j, c_j\}$, where
$a_j \in A_i \subset M_i$ is an attribute genus, and $c_j$ is the compound entity genus called by $a_j$. The
genus $a_j$ is called the value component of $B_j$, while the genus $c_j$ is called the index
component. The set $BS_i = \{B_j,\ j = 1, \dots, k\}$ containing all the index bases of $N(M_i)$ is
called the Index Basis Set.

Definition  5:  Index  Function. 

An Index Function $i(g_j)$ is a rule which associates to every genus $g_j \in N(M_i)$ the
cardinality of its index set.


As an example, given a genus $g_i$ indexed by $j \times k \times l$, its index function $i(g_i)$ returns the value 3.

3.  Main  results. 

In this section we give an example for each level of integration previously
defined.

3.1. Level 1 integration example.

To  show  the  first  level  of  integration  we  need  to  introduce  the  definition  of  a  function 
sub-model. 

Definition 6: Function Sub-Model.

SubM(f) is called a Function Sub-model if the following properties hold:

a) SubM(f) is a normal model.

b) SubM(f) has at least a function genus $f \in FT_i \subset M_i$ indexed as singleton.

In the following we give a procedure which transforms a Structured Model $M_i$ with at
least one function genus indexed as singleton into a function sub-model.

The procedure, CREATE_FUNCTION_SUBMODEL, needs as input a
model $M_i$ and a singleton genus $f \in FT_i \subset M_i$, and produces as output a function
sub-model. The proof of this is given in Proposition 2.

procedure CREATE_FUNCTION_SUBMODEL (input: M_i, f; output: SubM(f));

/* Modify M_i into a function sub-model SubM(f) */

begin

    /* Step I. "Normalize the model" */

    NORMAL (M_i);

    /* Step II. "Merge functions" */

    Create a LIST of calling sequence segments s_i of f;

    repeat

        Examine the segment s_i ∈ LIST;

        if the referred genus g_x ∈ FT_i

        then

            /* a */ Substitute s_i with the calling sequence of g_x;

            /* b */ Substitute the value field of g_x with its rule;

            /* c */ Delete g_x;

        Delete the segment s_i;
    until (end of LIST);

    /* Step III. "Delete genera having no influence on f" */

    Create a LIST of g_j ∈ M_i;

    repeat

        Examine g_j ∈ LIST;

        if (g_j ∈ FT_i and g_j ≠ f)
        then delete g_j;

        if (g_j ∈ A_i ∪ PC_i and g_j is not called directly or indirectly by f)
        then delete g_j;

    until (there are no more g_j ∈ FT_i ⊂ M_i, g_j ≠ f) and (there are no more g_j ∈
           A_i ∪ PC_i ⊂ M_i not called directly or indirectly by f)

end.
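
Steps II and III of the procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the model is assumed to be already normalized (Step I is omitted), genus rules and value fields are not modeled (so sub-step b is skipped), and the names `kinds`, `calls` and the example genera are hypothetical.

```python
FT = {"f", "t"}   # kind codes assumed for function and test genera

def create_function_submodel(kinds, calls, f):
    """Return the calling sequences of the genera kept in SubM(f).

    kinds: genus name -> "pe" | "ce" | "a" | "f" | "t"
    calls: genus name -> list of genera in its calling sequence
    """
    # Step II: replace every segment of f that refers to a function or test
    # genus by that genus' own calling sequence, until only A/PC genera remain.
    pending, merged = list(calls[f]), []
    while pending:
        g = pending.pop(0)
        if kinds[g] in FT:
            pending = list(calls[g]) + pending
        elif g not in merged:
            merged.append(g)
    new_calls = {g: list(c) for g, c in calls.items()}
    new_calls[f] = merged

    # Step III: keep f and the genera it calls directly or indirectly;
    # every other genus has no influence on f and is dropped.
    keep, stack = {f}, list(merged)
    while stack:
        g = stack.pop()
        if g not in keep:
            keep.add(g)
            stack.extend(new_calls[g])
    return {g: c for g, c in new_calls.items() if g in keep}

# Hypothetical toy model.
kinds = {"PLANT": "pe", "LINK": "ce", "COST": "a", "FLOW": "a", "SUP": "a",
         "LINKCOST": "f", "TOTAL": "f", "TSUP": "t"}
calls = {"PLANT": [], "LINK": ["PLANT"], "COST": ["LINK"], "FLOW": ["LINK"],
         "SUP": ["PLANT"], "LINKCOST": ["COST", "FLOW"],
         "TOTAL": ["LINKCOST"], "TSUP": ["FLOW", "SUP"]}

sub = create_function_submodel(kinds, calls, "TOTAL")
print(sorted(sub))   # ['COST', 'FLOW', 'LINK', 'PLANT', 'TOTAL']
```

In this toy run the intermediate function `LINKCOST` is merged into `TOTAL`, while `SUP` and `TSUP` are deleted because `TOTAL` never calls them.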


Proposition 2: Given a Structured Model $M_i$ and an arbitrary singleton function genus
$f \in FT_i \subset M_i$, there exists a transformation $T$ such that:

$T(M_i) = SubM(f)$

and $SubM(f)$ and $M_i$ are behaviour equivalent on $f$.

Proof: By applying procedure CREATE_FUNCTION_SUBMODEL, which defines the transformation $T$. ■

Let us show how a function genus $f$ can be reused as an input parameter for other
models. This action is totally automated; here is an example.

Suppose we have two models $M_1$ and $M_2$, and we want to substitute the genus $g_j \in A_1 \subset M_1$
with the computed value given by the genus $f \in FT_2 \subset M_2$. This goal is achieved by
applying the following procedure (the symbol $[M_1, SubM_2]$ denotes the integrated output
model):

procedure REUSE (input: M_1, M_2, g_j, f; output: [M_1, SubM_2]);
/* Integrate M_1 and M_2: f is substituted for g_j */

begin

    /* Step I "Changes in M_2" */

    CREATE_FUNCTION_SUBMODEL (M_2, f; SubM_2(f));

    Create a LIST of genera g_k ∈ A_2 ⊂ SubM_2(f);

    repeat

        Add the calling sequence segments of g_j ∈ A_1 to the calling sequence of
        g_k ∈ A_2 ⊂ SubM_2(f);
    until end of LIST;

    /* Step II "Changes in M_1" */

    Create a LIST of genera g_i ∈ FT_1 ⊂ N(M_1);
    repeat

        Select g_i from LIST;

        if g_i calls g_j ∈ A_1

        then

            Substitute g_j with f in the calling sequence of g_i;

        LIST := LIST - g_i;
    until end of LIST;

    /* Step III "Delete attribute genus" */

    Delete g_j ∈ A_1;

end.


Proposition 3: Given two Structured Models $M_1$ and $M_2$, it is always possible to
substitute an attribute genus $g_j \in A_1 \subset M_1$ with a singleton function genus $f \in FT_2 \subset M_2$.
The result is a Structured Model.

Proof: By applying the procedure REUSE we obtain as result the model $[M_1, SubM_2]$. Its graph of
genera must be finite, closed and acyclic.

a) Finiteness. Step III guarantees that the number of genera of $[M_1, SubM_2]$ is equal to the number of
genera of $(M_1 \cup SubM_2(f)) - g_j$.

b) Closure. By steps I and II, there is at least one genus of $M_1$ calling a genus of $SubM_2(f)$ and at least
one genus of $SubM_2(f)$ calling a genus of $M_1$. From the closure of $M_1$ and $SubM_2(f)$ follows the
closure of $[M_1, SubM_2]$.

c) Acyclicity. Let us consider an arbitrary sub-set of genera $G_1 \subset [M_1, SubM_2]$, and let us assume that
it is cyclic. Then $G_1$ contains genera belonging to both models, because no new references are set by
the procedure among genera belonging to only one model. By construction, the sequence must be of the
type:

$(\dots,\ a_j \in A_2 \subset SubM_2(f),\ f,\ \dots)$.

The genus following $f$ in the sequence has to be a function genus, while the genus preceding $a_j$ has to be
a compound entity genus. By Lemma 3 there are no references among function genera and compound
entity genera. Therefore, $G_1$ cannot be cyclic. ■
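
The closure and acyclicity conditions used in the proof can be checked mechanically on any finite graph of genera. The sketch below is an illustration only, assuming the integrated model is given as a hypothetical dictionary `calls` from genus name to calling sequence; finiteness is implicit in that representation.

```python
def is_closed(calls):
    """Every genus named in a calling sequence is itself part of the model."""
    return all(h in calls for refs in calls.values() for h in refs)

def is_acyclic(calls):
    """Depth-first search for a cycle in the graph of genera."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, done
    colour = {g: WHITE for g in calls}

    def visit(g):
        colour[g] = GREY
        for h in calls[g]:
            state = colour.get(h, BLACK)  # unknown genera cannot close a cycle
            if state == GREY:
                return False              # back edge: a cycle was found
            if state == WHITE and not visit(h):
                return False
        colour[g] = BLACK
        return True

    return all(colour[g] != WHITE or visit(g) for g in calls)

# Hypothetical integrated model and a deliberately cyclic counter-example.
integrated = {"PLANT": [], "LINK": ["PLANT"], "COST": ["LINK"],
              "TOTAL": ["COST"], "PROFIT": ["TOTAL"]}
cyclic = {"a": ["b"], "b": ["a"]}

print(is_closed(integrated), is_acyclic(integrated))  # True True
print(is_acyclic(cyclic))                             # False
```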

Figure 2 shows how two models are integrated.


[Figure 2: integration of M and SubM(f)]


Proposition 4. Given two normal models $M_1$ and $M_2$, the integrated model obtained by
substituting an input parameter $g_j \in A_1 \subset M_1$ with an output parameter $g_i \in FT_2 \subset M_2$
is a Structured Model if $i(g_j) = i(g_i)$.

Proof: It follows the same line as Proposition 3. The necessary condition given by the equality of the
index functions ensures the closure and acyclicity of the graph of the elements. ■

Given the result of Proposition 4 the following procedure can be constructed. The
input parameters are the two normal models, an index basis of the model $M_1$ and a
function genus of the model $M_2$.

procedure USE (Input: N(M_1), N(M_2), B_1, f; Output: [N(M_1), N(M_2)]);

begin

    Select a_1 ∈ B_1;

    Compute i_1(a_1);

    Compute i_2(f);

    if i_1(a_1) ≠ i_2(f) then exit;

    Create a LIST of genera g_j ∈ FT_1 ⊂ N(M_1);

    repeat

        Select g_j from LIST;

        if g_j has a reference to a_1 then

            Substitute the reference to a_1 with a reference to f;

        LIST := LIST - g_j;
    until end of LIST;

    Delete B_1;


end.
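
A compact Python reading of USE is sketched below. It is not the authors' code: it assumes that the genus names of the two models are disjoint, that the index component of the basis is referenced only by its value component (as in a normal model), and that the index function values are supplied in a hypothetical dictionary `arity`. All example names are illustrative.

```python
def use(calls1, kinds1, calls2, arity, basis, f):
    """Replace references to the value component of `basis` in N(M1) by
    references to the function genus f of N(M2).

    basis is the pair (a1, c1); arity maps a genus to its index function value.
    Returns the integrated calling sequences, or None when i(a1) != i(f).
    """
    a1, c1 = basis
    if arity[a1] != arity[f]:
        return None                        # index functions differ: exit
    merged = {g: list(c) for g, c in calls2.items()}
    for g, refs in calls1.items():
        if g in (a1, c1):
            continue                       # "Delete B1"
        if kinds1[g] in ("f", "t"):
            refs = [f if h == a1 else h for h in refs]
        merged[g] = list(refs)
    return merged

# Hypothetical models: REVENUE in M1 reads PRICE, which M2 can compute as UNITCOST.
kinds1 = {"ITEM": "pe", "C_PRICE": "ce", "PRICE": "a", "REVENUE": "f"}
calls1 = {"ITEM": [], "C_PRICE": ["ITEM"], "PRICE": ["C_PRICE"],
          "REVENUE": ["PRICE"]}
calls2 = {"PLANT": [], "COST": ["PLANT"], "UNITCOST": ["COST"]}
arity = {"PRICE": 1, "UNITCOST": 1}

result = use(calls1, kinds1, calls2, arity, ("PRICE", "C_PRICE"), "UNITCOST")
print(result["REVENUE"])   # ['UNITCOST']
```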
The following steps create an integrated model, which is the same result as in
Geoffrion. The final graph of genera, obtained by applying Step 0 to Step IV
sequentially, is shown in figure 4.


Step 0.

    NORMAL (fin);

    NORMAL (mkt);

    NORMAL (mar);

    NORMAL (mfg);

Step I.

    SUBSTITUTE (mkt, mar, [P,D1], [P,D2]);
    USE (mar, mkt, [V,D3], V);

Step II.

    USE (mfg, mar, [V,D5], V);

    USE (mar, mfg, [E,D4], E);

Step III.

    SUBSTITUTE (fin, mar, [P,D6], [P,D2]);
    USE (fin, mar, [E,D8], E);

    USE (fin, mar, [V,D9], V);


Step IV.

    MERGE (mfg, mar, P, U);


[Figure 4: graph of genera of the integrated model, with sub-models fin, mkt, mar and mfg]


3.3. Level 3 integration example.

At this level of integration the user needs to define the steps to integrate the models, and
there is no automated procedure. Let us present another example extracted from
Geoffrion [4]. The steps are informally defined, since the user will formalize them.


Step I

    Delete DEM and T:DEM genera from TRANS1;
    Delete SUP and T:SUP genera from TRANS2;

Step II

    Merge genus CUST from TRANS1 with genus
    PLANT from TRANS2;

Step III

    Create a new genus T:DC and define its
    reference;

Step IV (Optional)

    Create a new genus TOTS being the sum of
    the TOTS function genera of the two
    models;

Step V (Optional)

    Rename genera;


[Figure 5: integration of TRANS1 and TRANS2]


4.  Conclusions. 

The first remark about the definition of a formal theory of model integration concerns
modularity. This can be easily achieved by projecting the theory of Structured Modeling
into the same space as the Object Orientation Principles.

The second remark regards the construction of three sub-sets which contain the
procedures characterizing the formal rules of the three integration levels.

Both aspects will be developed further in the future.
3.2. Level 2 integration example.

In this case the role of the user is relevant, since the input parameters to be merged are
identified only by the user.

Proposition 5. Given two normal models $N(M_1)$ and $N(M_2)$ and the corresponding
index basis sets $BS_1$ and $BS_2$, the integrated model obtained by substituting, in $N(M_1)$,
$B_j \in BS_1$ with $B_k \in BS_2$ is a Structured Model.

Proof: To substitute $B_j$ with $B_k$ implies that every genus $g_i \in FT_1$ has to replace the reference in its
calling sequence to $a_j \in B_j$ with $a_k \in B_k$. The graph of genera of the integrated model has to be: (a)
finite; (b) closed and (c) acyclic. (a) and (b) hold by construction; (c) holds by Lemma 2. ■

Proposition 6. Given two normal models $N(M_1)$ and $N(M_2)$, and the corresponding
index basis sets $BS_1$ and $BS_2$, the integrated model obtained by substituting, in $N(M_1)$, an
index component $c_j \in B_j \in BS_1$ with an index component $c_k \in B_k \in BS_2$ is a Structured
Model.

Proof: It follows the lines of Proposition 5. ■

Given the results of Propositions 5 and 6 the following procedures can be
constructed.

procedure SUBSTITUTE (Input: N(M_1), N(M_2), B_1, B_2;

                      Output: [N(M_1), N(M_2)]);

begin

    Create a LIST of genera g_j ∈ FT_1 ⊂ M_1;

    repeat

        Select g_j from LIST;

        Substitute a_1 ∈ B_1 with a_2 ∈ B_2 in the calling sequence of g_j;
        LIST := LIST - g_j;
    until end of LIST;

    Delete B_1;

end.


procedure MERGE (Input: N(M_1), N(M_2), B_1, B_2;

                 Output: [N(M_1), N(M_2)]);

begin

    Select c_1, a_1 ∈ B_1, c_2 ∈ B_2;

    Substitute c_1 with c_2 in the calling sequence of a_1;

end.


Next we treat the core example extracted from Geoffrion [4]. The sub-models to
be integrated are shown in figure 3 (the details are omitted):


Figure  3 
1. Introduction

As pointed out by many authors, a Model Management System (MMS) provides for the
creation, storage and manipulation of models, and for access to them. MMS functions can be
divided into two main groups: Model storage functions and Model manipulation functions. The
former  includes  Model  Building,  Model  Representation,  physical  and  logical  Model 
Storage  and  Model  Retrieval;  the  latter  includes  Model  Instantiation,  Interface  with 
Databases,  Model  Maintenance,  Links  between  model  and  Algorithms,  and  Model 
Solving. 

Model representation schemes play a key role in the implementation of effective
MMSs. To fully implement the functions of MMSs, we need to state a rigorous
conceptual framework with a single model representation leading to:

1) independence of model representation and model solution,

2) representational independence of general model structure and detailed data
needed to describe specific model instances.

A  system  based  on  these  ideas  would  show  its  usefulness  for  most  phases  of  the 
life-cycle  associated  with  model-based  work  (Geoffrion  1987).  For  example,  consider  a 
mathematical  programming  problem.  Once  a  model  of  this  problem  has  been 
constructed,  MMS  should  allow  the  user  to  perform  the  following  steps: 

1)  select  the  solution  technique  (if  any), 

2)  solve  the  model, 

3) conduct sensitivity analysis.

To  automate  steps  1  and  2,  the  system  has  to  be  able: 

a)  to  recognize  what  kind  of  model  arises  (so  that  it  could  automatically  select 
the  appropriate  solver); 

b)  to  translate  data  instantiating  the  model  (querying  the  Database  where  they 
are  stored)  into  the  format  required  by  the  selected  solver. 

This paper will focus on the model recognition phase. We will try to give its
theoretical foundations and to define which conditions a model definition language has
to satisfy so that the resulting representation is “recognizable”.

Our formalization of the recognition process is based on the concept of “minimal
representation”. A representation of a model is minimal if any other equivalent
representation of the same model can be “reduced” to it.
2. Model Recognition Problem: Preliminary Results.

The aim of this section is to provide some formal definitions. In the next section, we will
use them to illustrate how the recognition process can be carried out.

The  recognition  process  we  are  trying  to  formalize  is  based  on  the  concept  of 
minimal  representation.  A  representation  of  a  model  is  minimal  if  any  other  equivalent 
representation  of  the  same  model  can  be  reduced  to  it. 

In the rest of this paper we will define and explain minimality, equivalence and
reduction of model representations; first we need to define what we mean by “model”
and “model representation”.

Definition  1 

We define the system M to be a model of the system P if:

- M interacts neither directly nor indirectly with P;
- M is used to obtain information about P;

- M comprises all the elements of P relevant for the intended purpose of the model.


Definition  2 

Given a formal language L and a model $M_i$, we define $L(M_i)$ to be its formal representation under L
if it comprises the expression in language L of all the elements of $M_i$ and of the interactions existing
among them.

In  the  following  we  will  use  the  terms  “model  representation”  or  simply 
“representation”  to  indicate  the  “formal”  representation  of  a  given  model  under  some 
formal  language. 

Let us consider, as an example, model $M_i$ as the model of the system P that computes
the mean of a given series of values belonging to P; if L is the standard algebraic
notation, then $L(M_i)$ will be:

$$\text{mean} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad [1]$$

If $L(M_i)$ exists and is unique, then the recognition problem has a trivial solution,
because there is a 1:1 correspondence between a model and its representation.
Unfortunately, except for very few cases, the model $M_i$ has many representations
$L_j(M_i)$, $j = 1, \dots, n$, $n > 1$. Referring to the previous example, two other ways to represent
the same model are the following ones:
$$\text{sum} = \sum_{i=1}^{n} x_i, \qquad \text{mean} = \frac{\text{sum}}{n} \qquad [2]$$

$$\text{result} = \frac{\sum_{k=1}^{n} y_k}{n} \qquad [3]$$


It is intuitive that all previous representations are equivalent, in so far as they “do the
same thing”. Nevertheless, for our purposes we need a more rigorous definition of
equivalence based on the concept of “transformation rule”.
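
The intuition that representations [1], [2] and [3] "do the same thing" can be illustrated numerically. The sketch below is only an illustration: the function names are hypothetical, and agreeing on one data set does not by itself establish the formal equivalence defined next.

```python
def mean_1(x):
    # Representation [1]: a single equation.
    return sum(x) / len(x)

def mean_2(x):
    # Representation [2]: the numerator is first assigned to an intermediate variable.
    total = sum(x)
    return total / len(x)

def mean_3(y):
    # Representation [3]: representation [1] with its elements renamed.
    result = sum(y) / len(y)
    return result

values = [2.0, 4.0, 9.0]
print(mean_1(values), mean_2(values), mean_3(values))  # 5.0 5.0 5.0
```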

We can think of a transformation rule as a function or procedure whose input is
the whole model representation or a part of it, and whose output is a new model
representation or a part of it. Obviously, the output of a transformation rule must be
semantically consistent with its input. Let us give its formal definition:

Definition  3 

Consider a formal language L and two distinct sets $E_1$ and $E_2$ of expressions of L that are semantically
identical. Let R be the set of all transformation rules defined on L; $r \in R$ is defined to be a
transformation rule on L if, applied to $E_1$, it transforms it into $E_2$.

The existence of transformation rules is very important to state formally the
equivalence of model representations. Two equivalent representations must be
semantically identical; in other words, there exist two (sets of) transformation rules that
transform one into the other, and vice versa. We can formalize the equivalence between
model representations as follows:

Definition  4 

Let $S_L = \{L_j(M_i);\ j = 1, \dots, n;\ n > 1\}$ be the set of all possible representations of $M_i$ in the language L. Two
representations $L_j(M_i), L_k(M_i) \in S_L$, $j \neq k$, are defined to be equivalent if there are two sets of
transformation rules, $R_1$ and $R_2$, defined on L such that $R_1$ applied to $L_j(M_i)$ transforms it into $L_k(M_i)$,
and $R_2$ applied to $L_k(M_i)$ transforms it into $L_j(M_i)$. If $R_1 = R_2$ then the representations are defined to be
identical. Obviously, identical representations are also equivalent.

As an example, let us consider two transformation rules, called split and join, suitable
to be applied to representations [1] and [2]. The terms LHS and RHS stand for
“left hand side” and “right hand side” respectively.

transformation split
input
    in_fraction type fraction
output
    out_assignment type assignment statement
    out_fraction type fraction
begin
    set RHS of out_assignment to numerator of in_fraction
    set numerator of out_fraction to LHS of out_assignment
    set denominator of out_fraction to denominator of in_fraction
end

transformation join
input
    in_assignment type assignment statement
    in_fraction type fraction
output
    out_fraction type fraction
begin
    if LHS of in_assignment ≠ numerator of in_fraction then
        exit join
    set numerator of out_fraction to RHS of in_assignment
    set denominator of out_fraction to denominator of in_fraction
end

The rule split performs the following operations: given a fraction, it reads its
numerator and assigns it to an intermediate variable, and then it builds another fraction
whose numerator and denominator are respectively the intermediate variable and the
denominator of the given fraction.

The rule join acts as follows: given an assignment statement and a fraction whose
numerator is the variable on the left hand side of the assignment statement, it builds a
new fraction whose numerator and denominator are respectively the right hand side of
the assignment statement and the denominator of the given fraction.
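
The two rules can be made concrete with a small Python sketch. It is an illustration under assumptions, not the authors' notation: expressions are kept as plain strings, the classes `Fraction` and `Assignment` are hypothetical stand-ins for the "fraction" and "assignment statement" types, and the name of the intermediate variable is passed explicitly as `name`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Fraction:
    numerator: str
    denominator: str

@dataclass(frozen=True)
class Assignment:
    lhs: str
    rhs: str

def split(in_fraction: Fraction, name: str) -> Tuple[Assignment, Fraction]:
    """Assign the numerator to the intermediate variable `name`."""
    out_assignment = Assignment(lhs=name, rhs=in_fraction.numerator)
    out_fraction = Fraction(out_assignment.lhs, in_fraction.denominator)
    return out_assignment, out_fraction

def join(in_assignment: Assignment, in_fraction: Fraction) -> Optional[Fraction]:
    """Inline the assignment into the fraction; None if the rule does not apply."""
    if in_assignment.lhs != in_fraction.numerator:
        return None
    return Fraction(in_assignment.rhs, in_fraction.denominator)

rep1 = Fraction("SUM(i, x[i])", "n")   # representation [1]
rep2 = split(rep1, "sum")              # representation [2]
print(join(*rep2) == rep1)             # True
```

Applying `join` to the output of `split` gives back the original fraction, which is the round trip that the equivalence of [1] and [2] relies on.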

Since we can transform representation [1] into representation [2] and vice versa by
applying respectively transformation rules split and join, they are equivalent in the
sense expressed in Definition 4. They are not identical, since the transformation rules we
need to apply are different.

Let us now consider a third rule, called rename, which renames all the elements of a
model definition, or a part of them, subject to the simple constraint that all elements
with identical name in the input model representation must have identical name in the
output one. Model representation [3] is one of the possible results of applying rule
rename to [1]. Since the transformation rule we need to apply to transform representation
[1] into representation [3] and vice versa is the same, they are identical.

3.  Model  Recognition  Problem:  Basic  Ideas. 

As asserted in the first section of this paper, our main task is to determine which
conditions have to be satisfied so that the recognition of a model can be performed. For
this purpose, we require that the language L allow the set of model
representations it produces to be ordered by rank. The rank is a measure, defined on
some measurable aspect of L, which makes it possible to classify and order model representations.
We formalize that as follows:

Definition  5 

A formal language L satisfies the property of rankability if:

- all model representations $L_j(M_i) \in S_L$ are equivalent;

- all model representations $L_j(M_i) \in S_L$ can be ranked;

- $S_L$ can be partitioned by rank and all the elements in the same cell of the partition are
identical.

In previous examples we might consider the number of equations as rank. If so, then
representations [1] and [3] are of rank 1 while representation [2] is of rank 2. Since all
representations are equivalent, and representations [1] and [3] are identical, then the
property of rankability holds.

Now, let us explain how the recognition process can be carried out. To recognize a model
representation means that we have to determine the model it represents. The basic idea
of this process is to transform the representation to be recognized into another one for
which we know the kind of model it represents. So doing, we have “recognized” the model.

If the representations we deal with are expressed in a language L satisfying
rankability, then all representations of the same model are equivalent. So, if we know
all transformation rules that language L allows, then we can recognize any
representation simply by transforming it into the known one by applying to it the
appropriate transformation rules.

The set of all transformation rules may be very large or even infinite. This
fact can influence the efficiency of the recognition process. The recognition process can
be carried out more efficiently if it is based on the ideas of “minimal representation”
and of “reduction rule”, defined as follows:

Definition 6

A model representation $L_j(M_i)$ is defined to be minimal if:

- L satisfies the property of rankability;

- it has the lowest possible rank.


Definition 7

Given a language L satisfying rankability, reduction rules are defined to be transformation rules
which, when applied to a model representation $L_k(M_i) \in S_L$ of rank $k$, produce a model representation
$L_j(M_i) \in S_L$ of rank $j < k$.

Referring to the previous example, we can consider representations [1] and [3] as
minimal ones.

The property of rankability plays a crucial role for our purposes; in fact, if L satisfies
rankability, all reduction rules are known, and they form a finite set, then:

- L always admits a minimal representation (i.e. a representation which has the
lowest possible rank, and to which any other representation of the same model
can be reduced);

- any model representation in language L can be reduced to its minimal form
(by applying to it the appropriate reduction rules until no more rules can be
applied);

- all minimal representations of the same model are identical.

Under the above mentioned conditions, the recognition process of a given model
representation can be based on the minimal representation by performing the following
basic steps:

1) reduce the model representation to be recognized to its minimal form;

2) search among the “known” minimal model representations for a template
matching the minimal representation obtained by step 1.
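
The two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' method: a representation is assumed to be a list of `(lhs, rhs)` equations, expressions are nested tuples with the operator at index 0, the rank is the number of equations, the only reduction rule is a join-like inlining, and identical representations are matched by renaming variables in order of first appearance. All function names are hypothetical.

```python
def uses(expr, name):
    """True if the variable `name` occurs in `expr` (operators sit at index 0)."""
    if isinstance(expr, tuple):
        return any(uses(e, name) for e in expr[1:])
    return expr == name

def substitute(expr, name, value):
    """Replace every occurrence of the variable `name` in `expr` by `value`."""
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(substitute(e, name, value) for e in expr[1:])
    return value if expr == name else expr

def reduce_once(eqs):
    """Apply one join-like reduction rule; return None when none applies."""
    for i, (lhs, rhs) in enumerate(eqs):
        others = eqs[:i] + eqs[i + 1:]
        if any(uses(r, lhs) for _, r in others):
            return [(l, substitute(r, lhs, rhs)) for l, r in others]
    return None

def minimal_form(eqs):
    """Step 1: reduce until no rule applies (the rank drops by one each time)."""
    while (reduced := reduce_once(eqs)) is not None:
        eqs = reduced
    return eqs

def canonical(eqs):
    """Rename variables by order of first appearance (the rename rule)."""
    names = {}
    def rename(e):
        if isinstance(e, tuple):
            return (e[0],) + tuple(rename(x) for x in e[1:])
        return names.setdefault(e, f"v{len(names)}")
    return tuple((rename(l), rename(r)) for l, r in eqs)

def recognize(eqs, templates):
    """Step 2: look the minimal form up among the known templates."""
    return templates.get(canonical(minimal_form(eqs)))

rep1 = [("mean", ("DIV", ("SUM", "x"), "n"))]
rep2 = [("sum", ("SUM", "x")), ("mean", ("DIV", "sum", "n"))]
rep3 = [("result", ("DIV", ("SUM", "y"), "n"))]
templates = {canonical(rep1): "arithmetic mean"}

print(recognize(rep2, templates))  # arithmetic mean
print(recognize(rep3, templates))  # arithmetic mean
```

Here representation [2], of rank 2, is first reduced to rank 1 and then matches the same template as [1] and [3].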

Since  for  any  given  language  L,  the  set  of  the  reduction  rules  must  necessarily  be  a 
subset  of  the  set  of  the  transformation  rules,  the  recognition  process  of  a  given  model 
based  on  the  minimal  representation  is  more  efficient  than  the  previous  one. 

Now, we can define formally the condition under which a given model definition
language generates “recognizable” model representations:

Definition 8

A model representation is defined to be recognizable if the recognition process:
— can be based on its minimal representation;
— can be performed in a finite number of steps.


Claim  1 

A formal model definition language L generates «recognizable» model representations if:

—  it satisfies the property of rankability,

—  the  set  of  all  reduction  rules  it  admits  is  finite. 

Proof: 

If L satisfies the property of rankability, then it always admits a minimal representation. If the set of
reduction rules is finite, any model representation can be reduced to its minimal form in a finite
number of steps. In this way both conditions for the recognizability of a model
representation are satisfied.  ■


3.  Conclusions 

Here we have sketched the fundamental lines along which model representations can be
“recognized”. The idea of minimality seems to us very promising and worth further
investigation.

Extended Abstract

This  paper  describes  a  simple  methodology  for  reasoning  about  temporal 
and  precedence  constraint  satisfiability  problems  arising  in  job  scheduling.  In 
particular,  a  Constraint  Satisfaction  Problem  (CSP)  approach  is  presented. 

Several researchers, from both Artificial Intelligence (AI) and
Operations Research (OR), have investigated methods for dealing efficiently
with time (see, e.g., [2, 3, 7, 12]); however, at least to the author’s knowledge,
only very few real, large-scale scheduling applications have been
approached using this relatively new technique [4].

In this paper, among all the job scheduling problems, an application is considered in
which a set V of n jobs has to be processed on a single machine,
such that a release date r_i, a deadline d_i and a process time p_i are associated
with each job i ∈ V. The problem is formulated on a constraint network, i.e.,
a digraph G = (V,A) of n nodes (jobs). An arc (i,j) ∈ A means that job j can
be processed immediately after job i. A weight p_j and the attributes r_j and d_j
are given for each node j ∈ V. Moreover, a digraph P = (V,E), with E ⊆ A,
is given such that an arc (i,j) ∈ E represents a precedence constraint between
jobs i and j. The problem consists of determining the starting times for
processing the jobs in V such that the time window (defined by r_j and d_j) for
scheduling the execution of each node (job) is respected and the precedence
constraints between nodes given by the relationships defined in arc set E are
satisfied within a time horizon (production plan).

Based on Allen’s model for temporal logic [1], a CSP formulation is
first presented. A CSP consists of a set of variables X = {x_1, x_2, ..., x_n}, their
associated domains D_1, D_2, ..., D_n and a set C of constraints on these variables.
A solution to a CSP consists of an instantiation of all the variables which does
not violate any of the constraints. In the case of the application considered in
this paper, let X be the set of variables such that x_i represents the starting time
for processing job i, ∀ i ∈ V. A domain D_i is associated with each variable x_i
such that D_i = { set of available Time Machine Units (TMUs) for processing
job i (production plan) }. The set C of constraints is defined by two classes of
constraints, namely C_1 and C_2, such that C = C_1 ∪ C_2, C_1 = { unary constraints
(time interval) } = { r_i ∀ i ∈ V } ∪ { d_i ∀ i ∈ V } and C_2 = { binary constraints
(precedences) } = { (i,j) ∈ E }. The problem is to verify whether an
instantiation of all the variables is possible such that all the jobs are completed
within their time interval and no precedence relationship is violated.
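
As an illustration of this formulation, the following minimal sketch checks whether a given instantiation satisfies C_1 and C_2. The data layout (a job is a tuple (r, d, p), starting times are integer TMUs) and the function name is_consistent are assumptions made here. The non-overlap test is also an assumption: it is read off from the single-machine setting and is not part of the set C defined above.

```python
from typing import Dict, Set, Tuple

Job = Tuple[int, int, int]  # (release date r_i, deadline d_i, process time p_i)


def is_consistent(jobs: Dict[str, Job],
                  prec: Set[Tuple[str, str]],
                  start: Dict[str, int]) -> bool:
    """Check an instantiation x_i = start[i] against C_1, C_2 and the machine."""
    # C_1, unary constraints: each job runs inside its time window.
    for i, (r, d, p) in jobs.items():
        if start[i] < r or start[i] + p > d:
            return False
    # C_2, binary constraints: (i, j) in E means j starts after i completes.
    for i, j in prec:
        if start[j] < start[i] + jobs[i][2]:
            return False
    # Single machine (assumption): no two jobs may overlap in time.
    names = sorted(jobs)
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            i, j = names[a], names[b]
            if start[i] < start[j] + jobs[j][2] and start[j] < start[i] + jobs[i][2]:
                return False
    return True


jobs = {"a": (0, 4, 2), "b": (0, 6, 3)}
prec = {("a", "b")}
print(is_consistent(jobs, prec, {"a": 0, "b": 2}))  # window and precedence hold
print(is_consistent(jobs, prec, {"a": 0, "b": 1}))  # b starts before a ends
```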

Starting from Allen’s interval algebra, the temporal relations are
specified by atomic relations. In particular, for each pair i,j of jobs the following
atomic relations are defined:

-  After(j,i): this specifies the precedence relationship between i and j, i.e.,
(i,j) ∈ E;

-  Available(i,r_i,D_i): this specifies the release date of job i within the
production plan;

-  Due(i,d_i,D_i): this specifies the deadline of job i within the production
plan.

The constraint network G of this problem is then "preprocessed" so as
to compute the tightest possible bounds for both unary and binary constraints
on the jobs. In particular, given the explicit precedence relationships between
jobs, the possibility of inferring additional implicit precedence relationships is
explored; for instance, the transitivity of the predicate After(j,i) allows
information to be inferred such as

-  After(j,k) ∧ After(k,i) ⇒ After(j,i).
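
The inference of implicit precedences by transitivity can be sketched as a transitive closure of the arc set E. This is a generic Warshall-style closure, not necessarily the procedure used in the full paper; the function name and the convention that a pair (i, j) means "i precedes j" are assumptions.

```python
from typing import Iterable, Set, Tuple


def transitive_closure(jobs: Iterable[str],
                       prec: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Return E plus every precedence (i, j) implied by (i, k) and (k, j)."""
    nodes = list(jobs)
    closure = set(prec)
    for k in nodes:          # k is the intermediate job
        for i in nodes:
            if (i, k) in closure:
                for j in nodes:
                    if (k, j) in closure:
                        closure.add((i, j))
    return closure


explicit = {("a", "b"), ("b", "c"), ("c", "d")}
implied = transitive_closure("abcd", explicit) - explicit
print(sorted(implied))
```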

Moreover, the availability interval of each job within the production plan
is computed by considering its release date, deadline and precedence
relationships. The new domain D_i' for each job i in V is hence computed such
that the predicate

-  D_i' = Interval(i,r_i,d_i) = D_i ∩ Available(i,r_i,D_i) ∩ Due(i,d_i,D_i)

returns the restricted time interval in which each job has to be processed
in order to obtain a feasible schedule of the jobs. Note that the set of possible
instantiations of the corresponding variables is thus noticeably reduced after
the computation of D_i', ∀ i ∈ V.
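
One possible reading of this domain reduction is sketched below, under several assumptions: domains are lists of integer starting times drawn from a common horizon, release dates are propagated forward and deadlines backward along the precedence arcs, and the precedence digraph is acyclic. The function name tighten_domains is hypothetical, and the interference between jobs on the single machine is deliberately ignored.

```python
from typing import Dict, Iterable, List, Set, Tuple

Job = Tuple[int, int, int]  # (release date r_i, deadline d_i, process time p_i)


def tighten_domains(jobs: Dict[str, Job],
                    prec: Set[Tuple[str, str]],
                    horizon: Iterable[int]) -> Dict[str, List[int]]:
    """Compute D_i' as the starting times compatible with window and precedences."""
    earliest = {i: r for i, (r, d, p) in jobs.items()}  # earliest start
    latest = {i: d for i, (r, d, p) in jobs.items()}    # latest finish
    # len(jobs) passes are enough on an acyclic precedence digraph.
    for _ in range(len(jobs)):
        for i, j in prec:  # i precedes j
            earliest[j] = max(earliest[j], earliest[i] + jobs[i][2])
            latest[i] = min(latest[i], latest[j] - jobs[j][2])
    slots = list(horizon)
    return {
        i: [t for t in slots if earliest[i] <= t and t + p <= latest[i]]
        for i, (r, d, p) in jobs.items()
    }


jobs = {"a": (0, 10, 3), "b": (2, 8, 2), "c": (0, 10, 4)}
prec = {("a", "b"), ("b", "c")}
print(tighten_domains(jobs, prec, range(10)))
```

In this small instance each job starts with up to ten candidate starting times and ends with two, which is the kind of reduction the text refers to.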

It  is  worth  mentioning  that  such  a  preprocessing  approach  allows  for 
further  generalization  of  the  proposed  scheduling  problem;  for  instance,  it 
could be necessary to take into account a possible decomposition of the jobs
into  different  subtasks  [13],  to  analyze  periodic  scheduling  problems  [9]  or  to 
consider setup times between jobs [5].

A Prolog-like algorithm is then presented for finding a consistent
assignment for the variables, i.e., an instantiation of all the variables which does
not violate any of the constraints given by both C_1 and C_2. In particular, the
procedure

-  x_i = Assign(i,D_i)

associates a value in the new domain D_i' with the corresponding variable
x_i, such that a feasible starting time for processing job i is given.

In  this  phase,  following  the  most-constrained  approach  suggested  in  [12], 
the  job  having  the  tightest  constraints  is  selected  first.  In  particular,  the 
procedure 

-  Preorder(X) 

sorts the set of variables in such a way that the most critical
job, i.e., the most constrained job, is chosen first for its instantiation.

In this particular application the most constrained path is proven to be the
most efficient implementation approach, in the sense that the number of
backtracking steps is minimized (see, e.g., [6, 7] for an overview of the complexity
of this kind of temporal CSP problem).
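
The interplay of Preorder and Assign can be sketched as a chronological backtracking search in Python rather than Prolog. This is a minimal sketch and not the algorithm of the full paper: "most constrained" is taken here to mean "smallest reduced domain", the ordering is computed once before the search, and the function names preorder and solve are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Job = Tuple[int, int, int]  # (release date r_i, deadline d_i, process time p_i)


def preorder(domains: Dict[str, List[int]]) -> List[str]:
    """Order jobs by domain size, smallest first (ties broken by name)."""
    return sorted(domains, key=lambda i: (len(domains[i]), i))


def solve(jobs: Dict[str, Job], prec: Set[Tuple[str, str]],
          domains: Dict[str, List[int]]) -> Optional[Dict[str, int]]:
    """Backtracking search for consistent starting times, or None."""
    order = preorder(domains)
    start: Dict[str, int] = {}

    def compatible(i: str, t: int) -> bool:
        p_i = jobs[i][2]
        for j, s in start.items():
            p_j = jobs[j][2]
            if t < s + p_j and s < t + p_i:      # overlap on the machine
                return False
            if (j, i) in prec and t < s + p_j:   # j must finish before i
                return False
            if (i, j) in prec and s < t + p_i:   # i must finish before j
                return False
        return True

    def assign(k: int) -> bool:
        if k == len(order):
            return True
        i = order[k]
        for t in domains[i]:
            if compatible(i, t):
                start[i] = t
                if assign(k + 1):
                    return True
                del start[i]                      # backtrack
        return False

    return dict(start) if assign(0) else None


jobs = {"a": (0, 10, 3), "b": (2, 8, 2), "c": (0, 10, 4)}
prec = {("a", "b"), ("b", "c")}
domains = {"a": [0, 1, 2], "b": [3], "c": [5, 6]}
print(preorder(domains), solve(jobs, prec, domains))
```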

Note that a different way of finding a feasible instantiation of all the
variables is to look for an initial solution, possibly inconsistent, and then
incrementally repair constraint violations until a consistent assignment is
achieved. Such an approach is proposed in [10] in the case of scheduling
problems without precedence and time window constraints.
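
The repair idea can be illustrated on the problem considered here with a simple greedy loop. This is only a sketch of the general principle and makes no claim to reproduce the method of [10]: the function names, the conflict count and the deterministic round-robin choice of the job to move are all assumptions. Such a loop may cycle without finding a solution, which is why it gives up after max_steps.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Job = Tuple[int, int, int]  # (release date r_i, deadline d_i, process time p_i)


def conflicts(jobs: Dict[str, Job], prec: Set[Tuple[str, str]],
              start: Dict[str, int], i: str) -> int:
    """Count the constraints violated by job i under the assignment start."""
    r, d, p = jobs[i]
    t = start[i]
    count = int(t < r) + int(t + p > d)          # time window
    for j, s in start.items():
        if j == i:
            continue
        p_j = jobs[j][2]
        if t < s + p_j and s < t + p:            # overlap on the machine
            count += 1
        elif (i, j) in prec and s < t + p:       # i should precede j
            count += 1
        elif (j, i) in prec and t < s + p_j:     # j should precede i
            count += 1
    return count


def repair(jobs: Dict[str, Job], prec: Set[Tuple[str, str]],
           start: Dict[str, int], horizon: Iterable[int],
           max_steps: int = 100) -> Optional[Dict[str, int]]:
    """Move one conflicting job at a time to its least-conflicting start."""
    start = dict(start)
    slots = list(horizon)
    for step in range(max_steps):
        bad = [i for i in sorted(start) if conflicts(jobs, prec, start, i) > 0]
        if not bad:
            return start                          # consistent assignment
        i = bad[step % len(bad)]                  # round-robin choice
        start[i] = min(
            slots, key=lambda t: conflicts(jobs, prec, {**start, i: t}, i))
    return None                                   # gave up


jobs = {"a": (0, 10, 3), "b": (0, 10, 2)}
prec = {("a", "b")}
print(repair(jobs, prec, {"a": 0, "b": 0}, range(9)))
```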

The application field and computational experience related to real-life
cases are also given in the full paper. Finally, some conclusions are drawn, along with a
comparison with a more traditional mathematical programming approach (see,
e.g., [5, 8]) for solving the scheduling problem under consideration.